[SOLVED] Pp.server Plone and page index

I use pp.server.plone and phantomjs to generate PDFs for a 'book'.

After upgrading the server, for unknown reasons the book view splits images across two pages, while the chapter view does not. Both use the same CSS file.

The book view is just a tal repeat, so I find it quite odd, especially since I tried with both the templates and CSS from a year ago and it 'does the same':

    <div tal:define="obj item/getObject;" tal:replace="structure obj/@@ebook_chapter_view | nothing" />

  1. Any suggestions on how to troubleshoot this?

  2. Another approach would be to download each chapter separately. If so, I would need the page numbering to start on a set page. Does anyone know if that is possible with phantomjs (or any other converter)?

Book view top (repeat loop of chapter view) and chapter view bottom

..because PhantomJS never was and never will be a reasonable PDF renderer. If you are using pp.server, try weasyprint as a free PDF renderer alternative. weasyprint complies with most of the CSS Paged Media standard. Another alternative would be PagedJS (requires installation of pagedjs-cli), which is probably the best long-term solution...but PhantomJS is clearly a candidate for the trashcan.
The latest pp.server==2.1.0 comes with extended support for the PagedJS and typeset.sh converters.

When I set this up, I only managed to get (LaTeX) math (MathJax) to work with phantomjs. Do you think that will work with any of the alternatives?

Also: do any of the alternatives have an option for 'start page numbering'? It would be nice if I did not have to export the whole book when there is just a small change in one chapter.


phantomjs before thursday:
http://www.marfag.no/k12/@@download/bok_som_pdf

Phantomjs after friday: http://www.marfag.no/k12/asPDF?converter=phantomjs&resource=resources_ebok&template=bok_template.pt

sorry, no idea. you have to try it out yourself.

Not sure if still usable with newer versions of pp.server but we use wkhtmltopdf 0.12.3 (with patched qt) with pp.server 1.0.7.1 and pp.client_plone-0.4.4 - this lets you use separate customized headers and footers for which you could provide parameters in the querystring.

Sample javascript in a header:

        <script>
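            // Call this when the header/footer page loads, e.g. <body onload="getQueryParms()">
            function getQueryParms() {
                //  wkhtmltopdf provides
                // 'page', 'frompage', 'topage', 'webpage', 'section', 'subsection', 'date',
                // 'isodate', 'time', 'title', 'doctitle', 'sitepage', 'sitepages'
                var wk_parms = {'page':'', 'topage':''};
                var url_parms = document.location.search.substring(1).split('&');
                // index loop: for...in on an array would also visit inherited enumerable properties
                for (var i = 0; i < url_parms.length; ++i) {
                    var temp_var = url_parms[i].split('=', 2);
                    // only accept the keys declared in wk_parms itself
                    if (wk_parms.hasOwnProperty(temp_var[0])) {
                        // decodeURIComponent also decodes escaped '&', '=' and '/' in values
                        wk_parms[temp_var[0]] = decodeURIComponent(temp_var[1] || '');
                    };
                };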
                // provide content to HTML elements with class names equalling wk_parms
                // e.g. <p>Page <span class="page"></span> of <span class="topage"></span></p>
                for (var css_class in wk_parms) {
                    var element = document.getElementsByClassName(css_class);
                    // console.log('processing', css_class);
                    // console.log('length',css_class,element.length);
                    for (var j = 0; j < element.length; ++j) {
                        element[j].textContent = wk_parms[css_class];
                    };
                };
                // header with large logo only on page 1
                if (wk_parms['page'] != 1) {
                    var element = document.getElementById('tmb-header-first-page');
                    if (element) {
                        element.parentNode.removeChild(element);
                    };
                };
                // header with small logo only on next pages
                if (wk_parms['page'] == 1) {
                    var element = document.getElementById('tmb-header-next-pages');
                    if (element) {
                        element.parentNode.removeChild(element);
                    };
                };
                // remove spacer from last page
                if (wk_parms['page'] == wk_parms['topage']) {
                    var element = document.getElementById('tmb-footer-spacer');
                    if (element) {
                        element.parentNode.removeChild(element);
                    };
                };
                // footer with producer info only on last page
                if (wk_parms['page'] != wk_parms['topage']) {
                    var element = document.getElementById('tmb-footer-last-page');
                    if (element) {
                        element.parentNode.removeChild(element);
                    };
                };
            };
        </script>

Happy to see that my software is still being used :hugs:

Thanks a lot.

In desktop publishing software, I think 'section' is the 'page number to start on'. Is it the same here?

Footers And Headers:
Headers and footers can be added to the document by the --header-* and
--footer* arguments respectively. In header and footer text string supplied
to e.g. --header-left, the following variables will be substituted.

  • [page] Replaced by the number of the pages currently being printed
  • [frompage] Replaced by the number of the first page to be printed
  • [topage] Replaced by the number of the last page to be printed
  • [webpage] Replaced by the URL of the page being printed
  • [section] Replaced by the name of the current section
  • [subsection] Replaced by the name of the current subsection
  • [date] Replaced by the current date in system local format
  • [isodate] Replaced by the current date in ISO 8601 extended format
  • [time] Replaced by the current time in system local format
  • [title] Replaced by the title of the of the current page object
  • [doctitle] Replaced by the title of the output document
  • [sitepage] Replaced by the number of the page in the current site being converted
  • [sitepages] Replaced by the number of pages in the current site being converted

https://wkhtmltopdf.org/usage/wkhtmltopdf.txt

As featured here: Product lifecycle management with Plone and Product Lifecycle Management with Plone
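
On the 'page number to start on' question: in the variable list above, `[section]` is the name of the current section, not a start page. The same usage text also lists a separate `--page-offset` option ("Set the starting page number", default 0). Below is a minimal sketch of building such an option string for pp.client's `cmd_options`. The helper name `chapter_footer_options` is hypothetical and not part of pp.server or pp.client, and I am assuming that an offset of 0 prints 1 on the first page, so please check the arithmetic against your wkhtmltopdf version:

```python
import shlex


def chapter_footer_options(first_page: int) -> str:
    """Build a wkhtmltopdf option string whose footer numbering starts at first_page.

    Hypothetical helper: only the flags --page-offset and --footer-center
    and the [page] variable come from the wkhtmltopdf usage text.
    """
    if first_page < 1:
        raise ValueError("first_page must be 1 or greater")
    options = [
        # Assumption: offset 0 (the default) prints 1 on the first page.
        "--page-offset", str(first_page - 1),
        # Print the (offset) page number in the middle of the footer.
        "--footer-center", "[page]",
    ]
    # Quote each part so the string survives shell-style splitting.
    return " ".join(shlex.quote(option) for option in options)


if __name__ == "__main__":
    # A chapter whose first page should be numbered 10.
    print(chapter_footer_options(10))
```

With something like this, a single chapter could be exported on its own while keeping the numbering it has in the full book, provided you know which page it starts on.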

I discovered two things that can mess things up.

  1. Timeout on some linked resources.
  2. Any iframe or similar from external sources.

In my case, splitting one chapter into two smaller ones and copying an external resource so that it is served locally worked.

In my case, the Norwegian Department of Education (or whatever their name is in English) was tasked with producing digital books etc. some years ago. After quite some time, the maritime schools that I had as customers got fed up with waiting and asked me to fix something for them. With markdown, pp.server and quite a few tricks, we produce online versions, PDFs and ebooks of great quality.

The 'official departments' have not managed to come up with anything useful yet; there were even 'corruption charges' since employees had 'given work to themselves'.

So the solution with pp.server still does a much better job than anything 'official', even though they have spent MANY years and millions of euros.

I have some time, so I took a look at this:

Do you know if it is possible to change the orientation (to Landscape) from JavaScript or (if that does not work) in the default config of wkhtmltopdf? I have been trying to find out where the default settings for wkhtmltopdf are stored.

It would be nice if it was possible to change the orientation from, for example, a setting on the content.

See the usage documentation link above. There are lots of configuration parameters. One is:

    -O, --orientation <orientation>  Set orientation to Landscape or Portrait (default Portrait)

In our implementation, those parameters are passed to pp.client/browser/pdf.py as "cmd_options" and used in __call2__()

Sorry, I am a bit confused here.
I know that you can set the orientation from the command line, but is this possible to do from javascript, for example to get page 1 landscape and page 2 portrait ?

About passing as parameters, how exactly do you do that?
Is it possible to do something like:

 http://path/to/page/asPDF?converter=wkhtmltopdf&orientation=landscape

Somewhat, yes. It appears that I customized pdf.py a bit to make it accept querystring parameters and pass those on to the converter.

See here: https://bitbucket.org/ajung/pp.client-plone/src/882df32395f26d8abce6fce7a6244647645f1a71/pp/client/plone/browser/pdf.py#lines-242

From my notes:

pp.client.python pdf takes the following arguments: source_directory, converter='princexml', output='', async=False, cmd_options='', server_url='http://localhost:6543', authorization_token=None, ssl_cert_verification=False, verbose=False

pp.server converter.py takes the following arguments: work_dir, work_file, converter, cmd_options, source_filename='index.html'

Which led me from

    result = pdf.pdf(destdir, converter, server_url=server_url, ssl_cert_verification=True)

to:

    result = pdf.pdf(destdir, converter, cmd_options=cmd_options, server_url=server_url, ssl_cert_verification=False)
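
A minimal sketch of how a querystring could be turned into the `cmd_options` string before the `pdf.pdf()` call. The whitelist `ALLOWED_OPTIONS` and the helper `cmd_options_from_query` are hypothetical names, not part of pp.client-plone; only `--orientation` with Landscape or Portrait comes from the wkhtmltopdf usage text. I would whitelist rather than pass the raw querystring through, since otherwise any visitor could inject converter flags:

```python
from urllib.parse import parse_qs

# Hypothetical whitelist: query parameter -> (wkhtmltopdf flag, allowed values).
ALLOWED_OPTIONS = {
    "orientation": ("--orientation", {"Landscape", "Portrait"}),
}


def cmd_options_from_query(query_string: str) -> str:
    """Translate whitelisted query parameters into a converter option string."""
    params = parse_qs(query_string)
    parts = []
    for name, (flag, allowed_values) in ALLOWED_OPTIONS.items():
        values = params.get(name)
        if not values:
            continue  # parameter not given, keep the converter default
        value = values[0].capitalize()  # accept "landscape" as well as "Landscape"
        if value not in allowed_values:
            raise ValueError("unsupported value for %s: %r" % (name, values[0]))
        parts.extend([flag, value])
    # Parameters that are not whitelisted (converter, resource, ...) are ignored here.
    return " ".join(parts)


if __name__ == "__main__":
    print(cmd_options_from_query("converter=wkhtmltopdf&orientation=landscape"))
```

With this, a request like `asPDF?converter=wkhtmltopdf&orientation=landscape` would produce `--orientation Landscape`, and an unknown value raises an error instead of reaching the converter.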

Are these options you use internally, or just 'passed on' ? Is something like this currently possible ?

http://path/to/page/asPDF?converter=wkhtmltopdf&cmd_options='-O landscape'

( My memory went with Covid… )

See __call2__() of browser/pdf.py

Thanks.

I have not had the time to test yet, but in the meantime I discovered that it is actually possible to rotate everything else instead (with CSS).

Just a test, but using two different templates (and rotating #main-content in one of them) can render the same content as

http://pdf.medialog.no/prospekter/hjemmeveien-12/asPDF?converter=phantomjs&resource=resources_pdf&template=pdf_template.pt

and

http://pdf.medialog.no/prospekter/hjemmeveien-12/asPDF?converter=phantomjs&resource=resources_pdf&template=pdf.pt

( wkhtmltopdf does not look good yet http://pdf.medialog.no/prospekter/hjemmeveien-12/asPDF?converter=wkhtmltopdf&resource=resources_pdf )

Both wkhtmltopdf and phantomjs are nowadays completely outdated.

The "modern" and "free" tools are weasyprint or PagedJS. With PagedJS you can also use features like Grid or Flexbox for design.

All these tools are supported by the most recent releases of pp.server and pp.client-python.

However, I am not sure if they are compatible with pp.client-plone (which is more or less unmaintained and outdated).

True, but they still work great (?) So far I have managed everything I have tried.

PS: I have no real use case at the moment, just wanted to make something to show to potential customers.