Is it somehow possible to take a screenshot of all pages (etc.) in a Plone site automatically?
I guess you can with Puppeteer if you have plone.restapi installed (or if you prepare a custom view that extracts the content you want to screenshot), but I never tried.
Something like this (untested):
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const testPage = await browser.newPage();
  // Expected to end up as an array of {url, title} objects
  const pages = await fetch(/* get pages */);
  // Visit the pages one at a time: a single tab cannot load several URLs at once
  for (const page of pages) {
    await testPage.goto(page.url);
    await testPage.screenshot({path: `pages/${page.title}.png`});
  }
  await browser.close();
})();
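For the "get pages" step, here is a rough sketch of how the list could come from plone.restapi. It is written in Python and is untested against a real site. It assumes the `@search` endpoint returns an `items` list whose entries carry `@id` and `title`, plus a `batching.next` link while more results remain. The names `fetch_json` and `list_pages` are made up for this example, and an anonymous request will probably only see published content.

```python
import json
import urllib.request


def fetch_json(url):
    """GET a URL and decode the JSON body (plone.restapi needs the Accept header)."""
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def list_pages(site_url, fetch=fetch_json):
    """Collect {'url', 'title'} dicts by following the @search batches.

    'fetch' is injectable so the function can be tried without a live site.
    """
    pages = []
    url = site_url.rstrip("/") + "/@search"
    while url:
        data = fetch(url)
        for item in data.get("items", []):
            pages.append({"url": item["@id"], "title": item.get("title", "")})
        # Assumption: 'batching.next' is missing on the last batch
        url = data.get("batching", {}).get("next")
    return pages
```

Calling `list_pages("http://yoursite.com")` would then give a list that has the same shape as `pages` in the snippet above.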
Thanks a lot. I did not get that to run, but it turns out there is a Python port (pyppeteer). It looks very promising.
I got this to work (with Python 3.7). I will check whether it is possible to run it from Plone.
import asyncio
from pyppeteer import launch


async def main():
    browser = await launch(headless=True)
    page = await browser.newPage()
    # A tall viewport so that most of a long page fits in one image
    await page.setViewport({'width': 1700, 'height': 4400})
    await page.goto('http://medialog.no')
    await page.screenshot({'path': 'example.png'})
    await browser.close()


asyncio.get_event_loop().run_until_complete(main())
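On running this from Plone: a browser view is ordinary synchronous code, so the coroutine has to be driven by an event loop that the view creates itself. A minimal sketch of that idea follows. `take_screenshot` is only a stand-in for the pyppeteer coroutine above and `run_sync` is a name I made up; I have not tried this inside a Plone worker thread, and whether pyppeteer itself behaves there is a separate question.

```python
import asyncio


async def take_screenshot(url):
    """Stand-in for the pyppeteer coroutine; returns the file name it would write."""
    await asyncio.sleep(0)  # placeholder for launch / goto / screenshot
    return "example.png"


def run_sync(coroutine):
    """Run a coroutine to completion from plain synchronous code."""
    # A private loop avoids depending on whatever loop the calling thread has
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(coroutine)
    finally:
        loop.close()


result = run_sync(take_screenshot("http://medialog.no"))
print(result)
```

On Python 3.7+ `asyncio.run(...)` does much the same thing; the explicit loop is used here so the sketch also works on 3.5 and 3.6.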
There is a simple solution: Just download http://yoursite.com/sitemap.xml.gz
Do a quick search and replace (or use lxml / etree) and use that list for 'pages'. Note: I open and close the browser for every page to avoid some kind of timeout error.
The script requires Python 3.5+ (and I am not sure how to deal with asyncio), but it should be possible to turn it into a browser view if / when the Plone site uses Python 3.
import asyncio
from pyppeteer import launch
#from lxml import etree

pages = ['http://www.medialog.no', 'http://plone.org']


async def main():
    # Number the output files: medialog1.png, medialog2.png, ...
    for a, webpage in enumerate(pages, start=1):
        # A fresh browser per page avoids the timeout error mentioned above
        browser = await launch(headless=True)
        page = await browser.newPage()
        await page.setViewport({'width': 1700, 'height': 4400})
        await page.goto(webpage)
        pagename = 'medialog' + str(a) + '.png'
        await page.screenshot({'path': pagename})
        await browser.close()


asyncio.get_event_loop().run_until_complete(main())
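The "use lxml / etree" route for building `pages` from the sitemap could look roughly like this. I swapped in the standard library's `xml.etree.ElementTree` instead of lxml so nothing extra needs installing, and I am assuming the sitemap uses the usual sitemaps.org namespace with one `<loc>` per `<url>`. The function name `urls_from_sitemap` and the sample XML are invented for the example.

```python
import gzip
import xml.etree.ElementTree as ET

# Assumption: the standard sitemaps.org namespace
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_from_sitemap(gzipped_bytes):
    """Return the <loc> URLs found in the raw bytes of a sitemap.xml.gz."""
    root = ET.fromstring(gzip.decompress(gzipped_bytes))
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


# Made-up sample data standing in for a downloaded sitemap.xml.gz
sample = gzip.compress(b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.medialog.no</loc></url>
  <url><loc>http://plone.org</loc></url>
</urlset>""")

pages = urls_from_sitemap(sample)
print(pages)
```

For a real site the bytes would come from something like `urllib.request.urlopen(...).read()` on the sitemap URL.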
Maybe not the best code, but it should work. Just change my_url to your Plone site.
import urllib.request
import gzip
from lxml import etree
import asyncio
from pyppeteer import launch
import os


async def main():
    my_url = "http://skipet.medialog.no/sitemap.xml.gz"
    # Download and unpack the gzipped sitemap
    sitemap = urllib.request.urlopen(my_url)
    sitemap_data = gzip.decompress(sitemap.read())
    parser = etree.XMLParser(remove_blank_text=True)
    elem = etree.XML(sitemap_data, parser=parser)
    a = 1
    print(len(elem))
    for element in elem:
        # The first child of each <url> entry is <loc>, the page address
        webpage = element[0].text
        # Turn the URL into a flat file name
        pagename = webpage.replace("http://", "")
        pagename = pagename.replace(".", "-")
        pagename = pagename.replace("/", "-") + '.png'
        a = a + 1
        # Skip pages that already have a screenshot, so the script can be re-run
        exists = os.path.isfile(pagename)
        if exists:
            print('done before: {0}'.format(pagename))
        else:
            # make preview
            browser = await launch()
            page = await browser.newPage()
            await page.setViewport({'width': 1700, 'height': 2400})
            await page.goto(webpage)
            await page.screenshot({'path': pagename})
            await page.close()
            await browser.close()
            print('saved screenshot: {0}'.format(pagename))
    return a


asyncio.get_event_loop().run_until_complete(main())
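One weak spot in the script is the file name: the chain of `replace` calls only strips `http://`, so an `https` site would leave a colon in the name. A slightly more defensive variant might look like this. `screenshot_name` is a name I made up, and the scheme of turning every run of non-alphanumeric characters into a dash is just one possible choice (two different URLs such as `/a/b` and `/a-b` would end up with the same name).

```python
import re
from urllib.parse import urlsplit


def screenshot_name(url):
    """Build a flat .png file name from a URL, ignoring the scheme."""
    parts = urlsplit(url)
    raw = parts.netloc + parts.path
    if parts.query:
        raw += "-" + parts.query
    # Collapse anything that is not a letter or digit into a single dash
    safe = re.sub(r"[^A-Za-z0-9]+", "-", raw).strip("-")
    return (safe or "index") + ".png"


print(screenshot_name("http://skipet.medialog.no/sitemap"))
print(screenshot_name("https://plone.org/"))
```

In the loop above this would replace the three `pagename = ...` lines with `pagename = screenshot_name(webpage)`. Note that a trailing slash no longer leaves a trailing dash, so names can differ from screenshots saved by the earlier version.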