[SOLVED] Pp.server Plone and page index

espenmn · August 24, 2020, 9:37am

I use pp.server.plone and phantomjs to generate PDFs for a 'book'.

After upgrading the server, for unknown reasons the book view splits images on two pages, while the chapter view does not. They use the same CSS file.

The book view is just a tal repeat, so I find it quite odd, especially since I tried with both the templates and CSS from a year ago and it 'does the same':

<div tal:define="obj item/getObject;" tal:replace="structure obj/@@ebook_chapter_view | nothing" />

Any suggestions on how to troubleshoot this?
Another approach would be to download each chapter separately. If so, I would need the page numbering to start on a set page. Does anyone know if that is possible with phantomjs (or any other renderer)?

[Screenshot: book view (repeat loop of chapter view) at top, chapter view at bottom]

zopyx · August 24, 2020, 1:56pm

...because PhantomJS never was and never will be a reasonable PDF renderer. If you are using pp.server, try weasyprint as a free PDF renderer alternative. weasyprint complies with most of the CSS Paged Media standard. Another alternative would be PagedJS (requires installation of pagedjs-cli), which is probably the best long-term solution... but PhantomJS is clearly a candidate for the trashcan.
The latest pp.server==2.1.0 comes with extended support for the PagedJS and typeset.sh converters.

espenmn · August 24, 2020, 2:34pm

When I set this up, I only managed to get (LaTeX) math (MathJax) to work with phantomjs. Do you think that will work with any of the alternatives?

Also: do any of the alternatives have an option for 'start page numbering'? It would be nice if I did not have to export the whole book when there is just a small change in one chapter.

phantomjs before thursday:
http://www.marfag.no/k12/@@download/bok_som_pdf

Phantomjs after friday: http://www.marfag.no/k12/asPDF?converter=phantomjs&resource=resources_ebok&template=bok_template.pt

zopyx · August 24, 2020, 2:49pm

sorry, no idea. you have to try it out yourself.

mtrebron · August 24, 2020, 5:30pm

Not sure if still usable with newer versions of pp.server but we use wkhtmltopdf 0.12.3 (with patched qt) with pp.server 1.0.7.1 and pp.client_plone-0.4.4 - this lets you use separate customized headers and footers for which you could provide parameters in the querystring.

Sample javascript in a header:

        <script>
            function getQueryParms() {
                //  wkhtmltopdf provides
                // 'page', 'frompage', 'topage', 'webpage', 'section', 'subsection', 'date',
                // 'isodate', 'time', 'title', 'doctitle', 'sitepage', 'sitepages'
                var wk_parms = {'page':'', 'topage':''};
                var url_parms =  document.location.search.substring(1).split('&');
                for (var url_parm in url_parms) {
                    var temp_var = url_parms[url_parm].split('=', 2);
                    if (temp_var[0] in wk_parms) {
                        wk_parms[temp_var[0]] = decodeURI(temp_var[1]);
                    };
                };
                // provide content to HTML elements with class names equalling wk_parms
                // e.g. <p>Page <span class="page"></span> of <span class="topage"></span></p>
                for (var css_class in wk_parms) {
                    var element = document.getElementsByClassName(css_class);
                    // console.log('processing', css_class);
                    // console.log('length',css_class,element.length);
                    for (var j = 0; j < element.length; ++j) {
                        element[j].textContent = wk_parms[css_class];
                    };
                };
                // header with large logo only on page 1
                if (wk_parms['page'] != 1) {
                    var element = document.getElementById('tmb-header-first-page');
                    if (element) {
                        element.parentNode.removeChild(element);

                    };
                };
                // header with small logo only on next pages
                if (wk_parms['page'] == 1) {
                    var element = document.getElementById('tmb-header-next-pages');
                    if (element) {
                        element.parentNode.removeChild(element);
                    };
                };
                // remove spacer from last page
                if (wk_parms['page'] == wk_parms['topage']) {
                    var element = document.getElementById('tmb-footer-spacer');
                    if (element) {
                        element.parentNode.removeChild(element);
                    };
                };
                // footer with producer info only on last page
                if (wk_parms['page'] != wk_parms['topage']) {
                    var element = document.getElementById('tmb-footer-last-page');
                    if (element) {
                        element.parentNode.removeChild(element);
                    };
                };
            };
        </script>

zopyx · August 24, 2020, 6:05pm

Happy to see that my software is still being used

espenmn · August 24, 2020, 8:31pm

Thanks a lot.

In desktop publishing software, a 'section' is (I think) where you set the 'page number to start on'. Is it the same here?

mtrebron · August 25, 2020, 10:56am

Footers And Headers:
Headers and footers can be added to the document by the --header-* and
--footer-* arguments respectively. In header and footer text strings supplied
to e.g. --header-left, the following variables will be substituted.

[page] Replaced by the number of the pages currently being printed
[frompage] Replaced by the number of the first page to be printed
[topage] Replaced by the number of the last page to be printed
[webpage] Replaced by the URL of the page being printed
[section] Replaced by the name of the current section
[subsection] Replaced by the name of the current subsection
[date] Replaced by the current date in system local format
[isodate] Replaced by the current date in ISO 8601 extended format
[time] Replaced by the current time in system local format
[title] Replaced by the title of the current page object
[doctitle] Replaced by the title of the output document
[sitepage] Replaced by the number of the page in the current site being converted
[sitepages] Replaced by the number of pages in the current site being converted

https://wkhtmltopdf.org/usage/wkhtmltopdf.txt
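On the 'page number to start on' question: going by the list above, [section] is the name of the current section, not a start number. The same usage document lists a separate option, --page-offset, described as setting the starting page number. Below is a minimal Python sketch that assembles such a command line. The helper name and file names are made up for illustration, and how the offset maps to the first printed number should be checked against your wkhtmltopdf version.

```python
import shlex


def build_wkhtmltopdf_args(source, target, page_offset=0, footer="Page [page]"):
    """Return an argv list for wkhtmltopdf.

    Illustrative helper only; it is not part of pp.server or pp.client.
    """
    args = ["wkhtmltopdf"]
    if page_offset:
        # --page-offset shifts the number that wkhtmltopdf substitutes for [page]
        args += ["--page-offset", str(page_offset)]
    # [page] is replaced by wkhtmltopdf itself, not by Python
    args += ["--footer-center", footer]
    args += [source, target]
    return args


# Hypothetical file names for a single exported chapter
args = build_wkhtmltopdf_args("chapter3.html", "chapter3.pdf", page_offset=9)
print(shlex.join(args))
```

The list form can be handed to subprocess.run() directly, which avoids shell quoting problems with the square brackets in the footer text.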

mtrebron · August 25, 2020, 11:01am

As featured here: Product lifecycle management with Plone and Product Lifecycle Management with Plone

espenmn · September 2, 2020, 10:09am

I discovered two things that can mess things up:

1. Timeouts on some linked resources.
2. Any iframe or similar from external sources.

In my case, splitting one chapter into two smaller ones and copying a resource so it is served locally worked.
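A minimal Python sketch for spotting such resources before rendering: it lists images, iframes, scripts and stylesheets in a page that point at a host other than your own. The host names and sample markup are placeholders, and the set of tags checked is an assumption that may need extending for your templates.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Tags and the attribute that makes a renderer fetch something
FETCHING_ATTRS = {"img": "src", "iframe": "src", "script": "src", "link": "href"}


class ExternalResourceFinder(HTMLParser):
    """Collect (tag, url) pairs whose URL points at another host."""

    def __init__(self, own_host):
        super().__init__()
        self.own_host = own_host
        self.external = []

    def handle_starttag(self, tag, attrs):
        wanted = FETCHING_ATTRS.get(tag)
        for name, value in attrs:
            if name == wanted and value:
                host = urlparse(value).netloc
                # Relative URLs have no host and are served locally
                if host and host != self.own_host:
                    self.external.append((tag, value))


def find_external_resources(html, own_host):
    finder = ExternalResourceFinder(own_host)
    finder.feed(html)
    return finder.external


# Placeholder markup and host names
sample = """
<img src="/images/ship.png">
<iframe src="https://video.example.org/embed/42"></iframe>
<link rel="stylesheet" href="https://www.example.com/styles.css">
"""
print(find_external_resources(sample, "www.example.com"))
```

Anything this reports is a candidate for copying locally or removing from the PDF template.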

espenmn · September 2, 2020, 10:18am

In my case, the Norwegian Department of Education (or whatever its name is in English) was tasked with making digital books etc. some years ago. After quite some time, the maritime schools that I had as customers got fed up with waiting and asked me to fix something for them. With markdown, pp.server and quite a few tricks, we produce online versions, PDFs and ebooks of great quality.

The 'official departments' have not managed to come up with anything useful yet; there were even 'corruption charges' since employees had 'given work to themselves'.

So the solution with pp.server still does a much better job than anything 'official', even though they have spent MANY years and millions of euros.

espenmn · May 18, 2021, 4:38pm

I have some time, so I took a look at this:

Do you know if it is possible to change the orientation (to Landscape) from JavaScript or, if that does not work, in the default config of wkhtmltopdf? I have been trying to find out where the default settings for wkhtmltopdf are stored.

It would be nice if it was possible to change the orientation from, for example, a setting on the content.

mtrebron · May 18, 2021, 6:54pm

See the usage documentation link above. There are lots of configuration parameters. One is -O, --orientation <orientation> Set orientation to Landscape or Portrait (default Portrait)

In our implementation, those parameters are passed to pp.client/browser/pdf.py as "cmd_options" and used in call2()

espenmn · May 18, 2021, 7:20pm

Sorry, I am a bit confused here.
I know that you can set the orientation from the command line, but is this possible to do from javascript, for example to get page 1 landscape and page 2 portrait ?

About passing as parameters, how exactly do you do that?
Is it possible to do something like:

 http://path/to/page/asPDF?converter=wkhtmltopdf&orientation=landscape

mtrebron · May 18, 2021, 8:16pm

Somewhat, yes. It appears that I customized pdf.py a bit to make it accept querystring parameters and pass those on to the converter.

See here: https://bitbucket.org/ajung/pp.client-plone/src/882df32395f26d8abce6fce7a6244647645f1a71/pp/client/plone/browser/pdf.py#lines-242

From my notes:

pp.client.python pdf takes the following arguments: source_directory, converter='princexml', output='', async=False, cmd_options='', server_url='http://localhost:6543', authorization_token=None, ssl_cert_verification=False, verbose=False

pp.server converter.py takes the following arguments: work_dir, work_file, converter, cmd_options, source_filename='index.html'

Which led me from
result = pdf.pdf(destdir, converter, server_url=server_url, ssl_cert_verification=True)
to:
result = pdf.pdf(destdir, converter, cmd_options=cmd_options, server_url=server_url, ssl_cert_verification=False)
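A minimal sketch of how querystring parameters might be turned into the cmd_options string, assuming cmd_options is a plain string of converter flags as the signature above suggests. The whitelist and helper names are hypothetical and this is not the code in pdf.py; the point is only that request values should be validated before they end up on a command line.

```python
def _orientation(raw):
    choices = {"landscape": "Landscape", "portrait": "Portrait"}
    return choices[raw.lower()]  # KeyError for anything else


def _page_offset(raw):
    return str(int(raw))  # ValueError for non-numeric input


# Hypothetical whitelist: querystring name -> (wkhtmltopdf flag, validator)
ALLOWED = {
    "orientation": ("--orientation", _orientation),
    "page_offset": ("--page-offset", _page_offset),
}


def build_cmd_options(query):
    """Build a cmd_options string from a dict of request parameters.

    Unknown parameters are ignored; invalid values raise ValueError.
    """
    parts = []
    for name, (flag, validate) in ALLOWED.items():
        raw = query.get(name)
        if raw is None:
            continue
        try:
            value = validate(raw)
        except (KeyError, ValueError):
            raise ValueError(f"invalid value for {name}: {raw!r}")
        parts.append(f"{flag} {value}")
    return " ".join(parts)


cmd_options = build_cmd_options({"orientation": "landscape", "converter": "wkhtmltopdf"})
print(cmd_options)
```

The resulting string would then be passed as cmd_options=cmd_options in the pdf.pdf() call shown above.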

espenmn · May 19, 2021, 2:25pm

Are these options you use internally, or just 'passed on' ? Is something like this currently possible ?

http://path/to/page/asPDF?converter=wkhtmltopdf&cmd_options='-O landscape'

( My memory went with Covid… )
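If a customized asPDF view does accept a cmd_options parameter (an assumption; the stock view may not), the quotes and the space in the URL above would need URL-encoding rather than literal quoting. A minimal Python sketch for building such a URL; the helper name is made up and the base URL is the placeholder from the question.

```python
from urllib.parse import urlencode


def as_pdf_url(page_url, converter, cmd_options=""):
    """Build an asPDF URL with properly encoded query parameters."""
    params = {"converter": converter}
    if cmd_options:
        # Assumes the view reads a 'cmd_options' parameter (customized setup)
        params["cmd_options"] = cmd_options
    return f"{page_url.rstrip('/')}/asPDF?{urlencode(params)}"


url = as_pdf_url("http://path/to/page", "wkhtmltopdf", "-O Landscape")
print(url)
# http://path/to/page/asPDF?converter=wkhtmltopdf&cmd_options=-O+Landscape
```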

zopyx · May 19, 2021, 2:53pm

See __call2__() of browser/pdf.py

espenmn · May 19, 2021, 4:14pm

Thanks.

I have not had the time to test yet, but in the meantime I discovered that it is actually possible to rotate everything else instead (with CSS).

Just a test, but using two different templates (and rotating #main-content in one of them) can render the same content as

http://pdf.medialog.no/prospekter/hjemmeveien-12/asPDF?converter=phantomjs&resource=resources_pdf&template=pdf_template.pt

and

http://pdf.medialog.no/prospekter/hjemmeveien-12/asPDF?converter=phantomjs&resource=resources_pdf&template=pdf.pt

( wkhtmltopdf does not look good yet http://pdf.medialog.no/prospekter/hjemmeveien-12/asPDF?converter=wkhtmltopdf&resource=resources_pdf )

zopyx · May 19, 2021, 5:11pm

Both wkhtmltopdf and phantomjs are nowadays completely outdated.

The "modern" and "free" tools are weasyprint or PagedJS. With PagedJS you can also use features like Grid or Flexbox for design.

All these tools are supported by the most recent releases of pp.server and pp.client-python.

However, I am not sure if they are compatible with pp.client-plone (which is more or less unmaintained and outdated).

espenmn · May 19, 2021, 6:03pm

True, but they still work great (?). So far I have managed everything I have tried.

PS: I have no real use case at the moment, just wanted to make something to show to potential customers.