GSOC 2019: plone.importexport

##Summary

The ability to import and export content between Plone and other CMS systems was proposed for GSoC 2017. The project was partly completed, but needs to be finished. A proposed UI is in the project issue tracker https://github.com/collective/plone.importexport/issues/15

The proposed UI currently looks like this:

Profile
[News Sync             ] [Load] [Save]
   
____|Export|___|*Import*|____

Import File(s)
- *warning* your import is large so will be done in multiple transactions from the browser. Please don't close your browser during upload. Aborting won't be possible.
- a zip containing files in folder structures with an optional index.csv containing metadata (see format), or a single csv metadata file, or drag and drop a folder here
[ /tmp/myzip.zip          ] [browse]

Primary Key
- Field in import metadata to match
[UUID                   ]

Existing content 
- to replace or update or add into
{query widget}
Path: /news
Creation date: > 1/1/2018

If Content Matched
(o) Update 
( ) Replace 
( ) Rename existing 
( ) Rename new 
( ) skip 
( ) Abort

If Content is New
- relative paths will be added to the first path found in the query widget
- Content type field must be specified or use content type in query
(o) Add and create folders
( ) Skip if folders don't exist
( ) Skip 
( ) Abort

If Existing doesn't Match 
( ) Remove  
(o) Skip 
( ) Abort

If more than one Match
( ) Remove all except first
( ) Skip all but the first
(o) Abort

Settings imports
[ ] Users (acl_users.csv)
[ ] Themes (portal_resources/*)
[ ] Registry (portal_registry.csv)
[ ] Generic Setup (portal_setup/*.xml)

[ ] Allow contributors to use this import profile in the Actions menu

[Import] [Dry Run] [Cancel]

Progress: 423/1024 (20s/43s)

383 Items have been updated 
23 items didn't match and skipped (view...)
2 items were added (view...)
10 updates skipped due to permissions
5 adds skipped due to permissions
[Download log]

Export would look like this:

Profile
[News Sync             ] [Load] [Save]
   
____|*Export*|___|Import|____

Existing content 
- to export
{query widget}
Path: /news
Path: /other-news
Creation date: > 1/1/2018

Export
[x] Metadata (CSV format)
[x] Files (Zip format)
[  ] Users
[  ] Settings
[  ] Themes 

Metadata to export
( ) All (o) Selected fields
[path, effective_date, title              ]

[Export] [Dry Run] [Cancel]

The design includes features like:

  • saved profiles
  • users, registry and exporting/importing other settings
  • flexible ways to bulk update or add content
  • handling very large imports using client side restapi calls (javascript)
  • potential AT support and handling converting to dexterity
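
The "If Content Matched" and "If Content is New" options in the mockup above amount to a small decision table keyed on the primary key field. A minimal Python 3 sketch of that idea follows. The function `plan_import` and its parameters are hypothetical and not part of plone.importexport; it only plans actions, much as a dry run would, and ignores the "more than one match" case.

```python
import csv
import io


def plan_import(rows, existing_keys, key="UID", on_match="update", on_new="add"):
    """Return a list of (action, key value) pairs, one per metadata row.

    Hypothetical helper, not part of plone.importexport. It only plans
    actions (like a dry run) and never touches any content.
    """
    plan = []
    for row in rows:
        value = row.get(key)
        # Pick the policy depending on whether the key matches existing content.
        policy = on_match if value in existing_keys else on_new
        if policy == "abort":
            raise RuntimeError("import aborted at %s=%r" % (key, value))
        plan.append((policy, value))
    return plan


# Example metadata, in the spirit of an index.csv with a UID column.
metadata = io.StringIO("UID,title\na1,First item\nb2,Second item\n")
rows = list(csv.DictReader(metadata))
plan = plan_import(rows, existing_keys={"a1"})
```

Keeping the planning step separate from the step that writes content is one way a "Dry Run" button and a real import could share the same logic.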

##Implementation

Complete the work outlined in this issue as well as other issues in the project tracker in GitHub.

##Skills

Python and CSV. Some JavaScript would also be welcome.

##Mentors

Dylan Jay, Shriyansh Agrawal (GSoC Student, 2017 and 2018)

##Aims

A fully polished release of plone.importexport available as an installable add-on and possibly accepted into Plone 6 as a core component.

Hey,
I am Uwais Zaki, a sophomore at IIT Roorkee. I have experience developing web applications with the Python frameworks Django and DjangoRestFramework and the JavaScript framework React. I find your project interesting and would like to participate in GSoC with it. How do I proceed further?

I think doing the Plone training at training.plone.org is a good start.
Another good start would be looking at some simple bugs in either plone.importexport or in Plone itself. We are commissioning some changes to plone.importexport right now, but they should be finished in the next day or so, so it would be good to see what's left after that is finished.
Plone has many bugs marked as being suitable for beginners if you look at its GitHub trackers (there are different ones for different modules).

Thinking about it more, probably the best way to start is to add some tests once the current work on importexport is finished.

2 posts were split to a new topic: Errors installing plone 5 stable

Hello @djay
I'm Aalekh Jain currently pursuing my B.tech and MS from IIIT-H, India.
I'm an Open Source enthusiast having experience in full stack development (primarily in) Python and Javascript and have used tools or frameworks like Django, Nodejs, Celery, Travis, MongoDB, Selenium to name a few, either in my personal projects or as part of our college's annual TechnoCultural festival. I am also an automation lover and love to automate things, be it on software or on a hardware level.

I have been exploring Plone over the past weeks and also played with it on my local machine. I have also gone through the documentation and training as well and I found Plone to be an exciting organization to contribute to.

I've gone through a couple of threads on this topic and have tried to develop a decent understanding on this, which is to develop an addon which would provide an import/export solution even for non-techy users.

A few of the major challenges I could recognize are:

  • Providing an easy to use UI which is built on top of the Plone restapi.
  • Handling permissions on the imported/exported content. (Maybe Plone restapi can handle permission as mentioned in their docs, I'm not fully sure here, correct me if I'm wrong)
  • Developing a mechanism to handle existing content and perform required action on them.
  • Handling large files during import/export.
  • Providing a solution for bulk update of required fields/contents.
  • Integrating useful error logs (if any) along with the import/export tasks.

Kindly guide me to proceed in this project for GSOC'19.

Thanks,
Aalekh Jain
https://github.com/ironmaniiith/

Welcome to plone @ironmaniiith. All of that sounds good. But perhaps you need to install and use the current version of plone.importexport to get a feel for how it currently works?
You might want to use one of the later branches as some work has already been done on it.
Note however that the UI uses parts of the restapi internally for serialisation but is not actually connecting via REST itself.

A post was split to a new topic: Installing plone.importexport - conflicting dependencies errors

Hi @djay and @Shriyanshagro
I've uploaded my draft on google summer of code website. Kindly review the draft and suggest required changes.

Thanks

I've merged one of the new PRs, which has lots of changes, and am now testing to see if this helps it build more easily.
Unfortunately the tests still don't pass.

This fix could be included as part of GSoC

Definitely, I've included that in my proposal as well.

Hi @djay
The latest commit solved almost all the problems. There were some minor issues, which I was able to solve by following some threads on community.plone. However, to get the project up and running, I had to install some more packages, which I did with

pip install configparser plone.directives.form plone.recipe.zope2instance==4.0

I tried several versions of plone.recipe.zope2instance below 4 as well, but only 4.0 worked for me.

I tried running import.export addon by exporting a zip file and then importing the same, which also generated a log file. Following is the screenshot attached

Sorry for the late response. I was busy drafting the proposal, which took me some time.

Kindly suggest how I should proceed further. Meanwhile, I am also following the training docs.

Thanks

@ironmaniiith Not sure what you mean by log but there is currently a bug that the path exported is not the same kind as is needed for import. There is a PR to fix that which needs to be reviewed and potentially merged - https://github.com/collective/plone.importexport/pull/34. It makes sense that import and export use relative paths.

Also you should be able to work around it by using UID instead.
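
To illustrate the relative-path idea, here is a minimal Python sketch of converting between absolute and root-relative paths so that an exported path can be re-imported under a different root. The helper names and the `/Plone` site root are assumptions for illustration only; this is not the code from the PR.

```python
def to_relative_path(absolute_path, root_path):
    """Strip root_path from the front of absolute_path (hypothetical helper)."""
    root = root_path.rstrip("/")
    if absolute_path == root:
        return ""
    if not absolute_path.startswith(root + "/"):
        raise ValueError("%r is not inside %r" % (absolute_path, root_path))
    return absolute_path[len(root) + 1:]


def to_absolute_path(relative_path, root_path):
    """Re-anchor a relative path under root_path (hypothetical helper)."""
    root = root_path.rstrip("/")
    if not relative_path:
        return root or "/"
    return root + "/" + relative_path


# Export from one site root, import into another.
exported = to_relative_path("/Plone/news/item-1", "/Plone")
imported = to_absolute_path(exported, "/OtherSite")
```

If export writes the relative form and import re-anchors it, the same file should round-trip between sites whose roots differ.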

Hi,
Thanks a lot for accepting my GSOC'19 proposal. :grinning:

Welcome, Aalekh Jain to the Plone community :slight_smile:
Our coding period will start at the end of this month. In the meantime, get yourself familiar with the code base and PyCharm. We may have a few standups before the coding period to freeze the specifications.
Have a lively hack this summer!

Welcome!

Getting familiar with how tests work in plone would be useful. Fixing the tests in this branch so we can start from a more stable branch would be good.

Since part of the potential work might involve enhancing the UI with JavaScript, I'd also look at mockup and robot tests in Plone so you are familiar with them.

Also have a look through the current issue list but in particular familiarise yourself with

which is the suggested UI to tie many of the features together and I think would be the content of the first part of the project and then looking at handling larger files after that.

Hi @ironmaniiith

I'm your third (backup) mentor for this project. Welcome to the Plone community. Looking forward to working with you.

One of the main reasons why the tests aren't working is that the zip file used in the tests is corrupted.
Zip file:

If you try to open the zip file or run the tests, it will throw a corrupted file error. You may need to write a hard-coded data structure as the test data or generate a new zip file to test the importer function.

You can find out more about testing Plone at https://docs.plone.org/develop/testing/unit_testing.html#running-unit-tests

My preference for tests is to avoid hard-coded test files if it's not too much work to do so. Having a zip in the archive doesn't seem like a great idea, as it makes it hard to inspect when it comes time to debug tests. Either have

  1. the test setup create the zip from files in the repo
  2. the test setup create the files from scratch and then zip them
  3. create the content in Plone, export it to create the zip and then use that during import tests.

I'd probably go for 3, but some could argue that it does not isolate export tests from import tests. But there need to be lots of tests to ensure the export can always be imported, so it's perhaps pragmatic.
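
For comparison, option 2 above could be sketched roughly as follows, assuming Python 3 and a zip layout of one file per item plus an index.csv with `path` and `title` columns. The helper name `build_import_zip` and the exact columns are assumptions; the real layout is whatever plone.importexport expects.

```python
import csv
import io
import zipfile


def build_import_zip(items):
    """Return the bytes of a zip with one file per item plus an index.csv.

    Hypothetical test helper; the column names are assumptions.
    """
    index = io.StringIO()
    writer = csv.DictWriter(index, fieldnames=["path", "title"])
    writer.writeheader()

    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for item in items:
            writer.writerow({"path": item["path"], "title": item["title"]})
            archive.writestr(item["path"], item["body"])
        # Write the metadata last, once every row has been collected.
        archive.writestr("index.csv", index.getvalue())
    return buffer.getvalue()


# Test data stays readable in the test source instead of a binary fixture.
zip_bytes = build_import_zip([
    {"path": "news/item-1.txt", "title": "Item 1", "body": "Hello"},
])
```

Because the archive is built in memory during test setup, there is no binary file in the repository that can get corrupted, and the test data can be read directly in the test module.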

Hi,
I'm trying to play with the importexport functionality, and every time I change the Python code I have to kill the server and restart it. Is there any way to make it detect changes automatically, or do I have to keep restarting the server?