Diazo: insert external web app into page

leejoramo · December 9, 2015, 1:03am

I am trying to include an existing external Python Flask application in a Plone 5 site via the Diazo theme. The app is mostly just a search page that returns an HTML table. I have Diazo inserting the form into the content area of a Plone page, but when the form is submitted, its POST data is not sent on to the Flask app.

Here are the Diazo rules I am using:

<rules if-path="/myapp">
  <replace css:theme="#content-container"
           css:content="#mycontent"
           href="http://example.com/form" />
</rules>

I am not finding any examples of using Diazo to theme external web applications. I do seem to remember that when Diazo was first being demoed, there was an example of re-theming a PHP forum and including it as part of a Plone site.

cdw9 · December 9, 2015, 3:42am

What is the action attribute on the form set to? You may need to make sure it is an absolute URL to the Flask app.

djay · December 9, 2015, 4:25am

You have to be careful about doing this, as it can result in running out of Zope instances really easily. Plone is single threaded and synchronous, so unless your Flask app is really quick, your Zope/Plone instance sits idle while it waits for Flask. Since Plone uses a lot of RAM, it is expensive to scale up a lot of Zope instances just to have them idle. I've seen really large sites fall over by doing exactly what you are doing.

We did develop one solution to this. It is largely undocumented, but I've just released the code. It might be overkill for what you need, but just in case, here is roughly how it works.

You install our middleware in your site. The request comes into Plone, where you use Diazo with server side includes. The middleware detects the special response and replays the original request (including POST data) to your external app. It then runs your server side include rules to pull the relevant parts out of your external app's HTML and delivers the final combined response to the user. It also sets special headers about the authenticated user so Plone can handle security.
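
The replay flow described above can be sketched as a small WSGI middleware. This is not the pretaweb.externalapp code, whose internals the thread does not describe; it is only an illustration under assumptions. The header name `X-Replay-To`, the class `ReplayMiddleware`, and the injected `fetch` callable are all hypothetical, and the server side include step that extracts parts of the external HTML is left out.

```python
import io

# Hypothetical marker header; not a real Plone or Diazo header.
REPLAY_HEADER = "X-Replay-To"


class ReplayMiddleware:
    """Replay the original request to an external app when the inner app asks for it."""

    def __init__(self, app, fetch):
        self.app = app
        # fetch(url, method, body) -> bytes. Injected so the sketch needs no network.
        self.fetch = fetch

    def __call__(self, environ, start_response):
        # Buffer the request body so it can be used twice: once by the inner
        # app (Plone, in the thread's setup) and once more for the replay.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        body = environ["wsgi.input"].read(length) if length else b""
        environ["wsgi.input"] = io.BytesIO(body)

        captured = {}
        chunks = []

        def capture(status, headers, exc_info=None):
            # Hold back the inner app's status and headers until we have
            # decided whether to pass the response through or replace it.
            captured["status"] = status
            captured["headers"] = headers
            return chunks.append  # legacy WSGI write() callable

        result = self.app(environ, capture)
        try:
            for chunk in result:
                chunks.append(chunk)
        finally:
            if hasattr(result, "close"):
                result.close()

        # Exact-case lookup keeps the sketch short; real header names are
        # case-insensitive.
        target = dict(captured["headers"]).get(REPLAY_HEADER)
        if target is None:
            # Ordinary response: pass it through untouched.
            start_response(captured["status"], captured["headers"])
            return chunks

        # Special response: send the original method and body to the external app.
        external = self.fetch(target, environ["REQUEST_METHOD"], body)
        start_response("200 OK", [
            ("Content-Type", "text/html; charset=utf-8"),
            ("Content-Length", str(len(external))),
        ])
        return [external]
```

In a real deployment `fetch` would make an HTTP call to the external app, and the result would still need to be merged into the themed page before being returned.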

I just pushed the code to https://github.com/collective/pretaweb.externalapp but no real docs yet.

ebrehault · December 9, 2015, 9:07am

My approach would be to implement it on the frontend using AJAX.
It is quite simple to load a form with JavaScript, inject it into your page, and on submit get the results and refresh your page.

And if you want to manage the rendering in a single place (your Plone theme), then turn your Flask app into a REST API backend which sends JSON.
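
As a rough sketch of the "backend which sends JSON" idea, here is a search endpoint that returns JSON instead of an HTML table. To keep it runnable without extra packages it is written as a plain WSGI callable using only the standard library rather than Flask; in Flask the same shape would be a route returning JSON. The `RECORDS` data, the `q` query parameter, and the function names are made up for illustration.

```python
import json
from urllib.parse import parse_qs

# Hypothetical data standing in for whatever the real app searches.
RECORDS = [
    {"id": 1, "title": "Plone theming"},
    {"id": 2, "title": "Flask search"},
    {"id": 3, "title": "Diazo rules"},
]


def search(term):
    """Return records whose title contains the term, ignoring case."""
    needle = term.lower()
    return [r for r in RECORDS if needle and needle in r["title"].lower()]


def search_api(environ, start_response):
    """WSGI app: read ?q=... and answer with a JSON document."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    term = query.get("q", [""])[0]
    payload = json.dumps({"query": term, "results": search(term)}).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(payload))),
    ])
    return [payload]
```

The page in the Plone theme would then request this endpoint from JavaScript and build the table itself, so the markup lives in one place.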

hvelarde · December 9, 2015, 11:28am

I completely disagree with this statement... as with everything, it depends: we have dozens of sites running on cheap DO $5 droplets with memory limited to 384MB and 2 Zope threads without any issues. Large sites are a different story, but most of the problems come from web crawlers hitting all the content all of the time. So I would say Plone uses as much RAM as it needs, depending on the site.

I agree with @ebrehault, it would be better to solve that using AJAX.

djay · December 9, 2015, 11:48am

Please read the whole use case before replying @hvelarde. I was specifically talking about when you are doing large numbers of requests that include synchronous calls to backend APIs or applications. In that case you can run out of Zope processes quickly if the wait time on those backend APIs is long.
Under normal circumstances, where Plone is serving content, it works fine.
Using AJAX is of course another solution but has other pros and cons. For example, you may not want to expose the backend web app to the web, or deal with the kind of security exchange needed if it involves authenticated users.

hvelarde · December 9, 2015, 12:09pm

no, @djay, I understood perfectly what you said, but that sentence is false: Plone doesn't use a lot of RAM and is not expensive to add more instances as needed.

you are right about one thing: you have to monitor the number of spare instances. in a perfect world you should always have one spare instance to serve the next request, so your infrastructure fits your site usage and budget.

on large public sites that's never an option, as you can always have the Slashdot effect.

last month, for instance, one of our sites had 10,000 concurrent users at one point. but, as most of them were visiting the same page, CloudFlare, nginx and Varnish did their part and our servers survived.

keul · December 9, 2015, 12:47pm

@djay thanks for sharing this.

leejoramo · December 9, 2015, 8:21pm

Thanks for all of your responses. I am now pretty sure that I was remembering an early demo of Deliverance running as middleware in front of Plone and a PHP forum.

I did try to quickly set up the solution @djay provided, but I ran into issues just getting Plone & WSGI working. While I am very interested in this method, I can see there would be a lot of work for me to learn how to get this running and then completely tested. I don't think that the site I am building would run into the performance concerns mentioned by Dylan, but I would definitely want to be sure.

So I will go with @ebrehault's suggestion of AJAX.

gyst · December 9, 2015, 8:40pm

Note that using Diazo (Deliverance) as an out-of-Plone frontend XSLT preprocessor (instead of running inside Plone) would also alleviate most of the concerns Dylan raised. It would be e.g. Apache idling, not Zope, and scaling Apache is a well known art.

leejoramo · December 9, 2015, 8:54pm

@gyst What disadvantages are there to running Diazo as a frontend? I can think of:

A little more complex and less standard server configuration
Not using Plone's through-the-web theming tools

Is there anything else that is lost?

I guess since Diazo is now pretty much a core technology of Plone, I would be effectively running Diazo twice: within Plone and in front of Plone.

djay · December 10, 2015, 2:08am

The statement was "Zope is expensive to have idle". For serving content, the RAM is used as a cache and helps serve content quickly, so it is not a waste of resources.
The specific scenario I'm talking about is like this:
A popular site has 4 servers with 4 CPUs and 16 Zope instances (each taking 1 GB due to the amount of content). 70% of requests are to a certain page that makes a backend call that can take 4s to respond. That means the total capacity of the system for that page is only 4rps (or less once Plone builds the page). Unless the load is balanced to separate those requests into a different queue, they rapidly starve the system once you get beyond 4rps, so not just that page is slow but every other page too.
My point was never that Zope is expensive to scale, but rather that, due to its architecture, it is inefficient with regard to calling backend APIs.
Btw the above is a real example built by real developers who had no idea why their site was falling over when all the CPUs were at 30%.
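
The 4rps figure can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: it assumes each of the 16 Zope instances handles one request at a time and ignores the time Plone itself spends building the page. The function names are made up for illustration.

```python
def max_throughput(workers, seconds_per_request):
    """Upper bound on requests per second when each worker handles one request at a time."""
    return workers / seconds_per_request


def busy_workers(requests_per_second, seconds_per_request):
    """Average number of workers tied up at a given arrival rate (Little's law)."""
    return requests_per_second * seconds_per_request


ZOPE_INSTANCES = 16   # from the scenario above
BACKEND_WAIT_S = 4.0  # seconds the backend call can take

# Ceiling for the slow page: 16 instances / 4 s each.
print(max_throughput(ZOPE_INSTANCES, BACKEND_WAIT_S))

# Instances blocked while waiting on the backend at 1, 2, 3 and 4 requests per second.
for rate in (1, 2, 3, 4):
    print(rate, busy_workers(rate, BACKEND_WAIT_S))
```

At 4 requests per second to the slow page, all 16 instances are waiting on the backend, which is consistent with the site falling over while the CPUs are mostly idle.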

Also we are working on another potential solution to this problem by breaking the Zope request into multiple transactions.

gyst · December 10, 2015, 8:12am

Yes, my hunch would be to run Diazo twice: once in Apache with only the /myapp edge side include, and then the rest "normally" via plone.app.theming. Not as a recommendation but as an option. You have to run into the kinds of problems Dylan highlights to push you there. AND there has to be a specific reason not to use the AJAX pattern, which is the first thing I would reach for.

dieter · December 10, 2015, 8:30am

We are approaching similar situations (long running backend tasks) by letting them be executed in separate threads and using "dm.zodb.transactional", especially its "scheduler" module, to interface between (Web-) request processing and the separate threads.

The stock "TransactionalScheduler" in "dm.zodb.transactional" keeps information in RAM. This implies that it requires that subsequent requests in the same session must go to the same Zope instance. A derivation would be needed to remove this restriction.

jensens · December 10, 2015, 4:07pm

Ok, I have no quick solution for the problem of blocking the thread. I think it's possible to bypass this, at least the blobs are doing so, but some research is needed.

But with RAM consumption this should be pretty easy:

use requests with streaming
write directly back to Zope Response (scroll down to REQUEST.write)

You may also need to prepare two template snippets to stream before and after the content fetched from the external service (or just render your template and cut it in two pieces at the placeholder).
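
The streaming idea above, including cutting the rendered template in two at a placeholder, can be sketched roughly as follows. The placeholder string and the function name are hypothetical. `source` is any file-like object that yields bytes (with the requests library this would presumably come from a call made with `stream=True`), and `write` is any callable that accepts bytes, which in Zope would presumably be the response object's `write` method.

```python
import io

# Hypothetical marker that the rendered template contains where the
# external content should appear.
PLACEHOLDER = "<!-- external-content -->"


def stream_into_template(template_html, source, write, chunk_size=8192):
    """Write the template head, then the external body chunk by chunk, then the tail.

    Only one chunk of the external response is held in memory at a time.
    Returns the number of body bytes streamed.
    """
    # If the placeholder is missing, partition() leaves the whole template in
    # head, so the external content ends up after the full template.
    head, _, tail = template_html.partition(PLACEHOLDER)
    write(head.encode("utf-8"))

    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        write(chunk)
        total += len(chunk)

    write(tail.encode("utf-8"))
    return total


# Usage with in-memory stand-ins for the external service and the response.
pieces = []
stream_into_template(
    "<html><body>" + PLACEHOLDER + "</body></html>",
    io.BytesIO(b"<table><tr><td>result</td></tr></table>"),
    pieces.append,
)
print(b"".join(pieces).decode("utf-8"))
```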

If you need transformations you need to transform on the stream. Simple URL replacement or body-detection should be no bigger problem. Other transformations can be delegated to Diazo.

djay · December 11, 2015, 3:41am

@gyst, the reason we implemented our own middleware is that using Apache/nginx to do SSI doesn't replay a POST request. If you are doing GET then you might not need it.

@jensens: that's an interesting solution, but are you sure Diazo can be applied to a streaming response? If so it would be very nice.
One caveat with this approach is that it only works when Zope is exposed directly to the web (or perhaps with special load balancers). HAProxy, for example, has no mode to tell it to send a new request to an instance on the first byte of the response. That is the point where the Zope transaction should be over and the instance is ready to process a new request. I did contact the author about this, but so far no luck. Perhaps if enough others email the author, he will reconsider. You can partially get around this by either detecting streaming requests and sending them to a special queue, or just setting the connection limit for that backend higher than your Zope thread count. We currently get around this for blobs by using collective.xsendfile.

gyst · December 11, 2015, 10:29am

@djay Thanks yes that explains why I abandoned my own early Deliverance experiments.

fredvd · December 11, 2015, 4:52pm

Using Diazo outside of Plone can mean two things: running it as a separate service, or compiling the XSLT rules and loading them into a frontend server's XSLT module.

We did a setup for an old Plone 3 site we didn't want to touch or take responsibility for code-wise but still had to give a new theme for a marketing campaign, so we set up Diazo as a separate WSGI server in between the Plone 3 instance and cache/upstream.

The biggest issue we had with this setup is that you have to serve the /static part of the Diazo theme yourself without /VirtualHostMonster/ rewriting getting in the way. In the previous setup, a reverse proxy in the network handed over all HTTP requests already rewritten, and it was difficult to untangle those, so we asked for the plain request and moved the rewriting to a WSGI middleware in the Diazo service. It still runs.

plone.app.theming does this nitty gritty integration for you.