GSoC 2018 Ideas: Improve Plone's import/export story

I just came up with a project idea around collective.transmogrifier improvements... I'm not sure how the two ideas are related or if they really overlap.

@djay @jean @hvelarde Any idea how to add a feature where the user can upload from a different site, given a URL and authentication?
How do we stream data between the sites?
Which protocol would be a good choice for streaming big data (large BLOBs) between sites?
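For illustration, here is a minimal sketch of one way to pipe a BLOB between two sites over plain HTTP with the requests library. The URLs, credentials, and the `@@download/file` view path are assumptions for the example, not a settled design:

```python
import requests

# Hypothetical source/target URLs and credentials, for illustration only.
SRC = "https://source.example.com/Plone/big-file/@@download/file"
DST = "https://target.example.com/Plone/upload"
AUTH = ("user", "secret")

# stream=True keeps requests from reading the whole BLOB into memory.
src = requests.get(SRC, auth=AUTH, stream=True)
src.raise_for_status()

# requests accepts any iterable as a request body, so the downloaded
# chunks can be piped straight into the upload without buffering the
# whole file on the machine in between.
requests.post(
    DST,
    data=src.iter_content(chunk_size=1024 * 1024),  # 1 MiB chunks
    headers={"Content-Type": "application/octet-stream"},
    auth=AUTH,
)
```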

This comment by @djay could be useful for a surface-level understanding of how it works. More about it can be found in the same thread.

@jean @tisto
senaite.sync seems like a good option, but isn't this achievable with plone.restapi?
I found in the senaite.sync documentation that it's built on top of the plone.jsonapi.core package, to allow Plone developers to build modern (JavaScript) web UIs which communicate through a RESTful API with their Plone site.
As far as I know, plone.restapi does the same job, though I see senaite.sync will definitely help in learning how to import huge amounts of data from a remote site.
@loechel @djay
As I see on the idea page:
"Add the ability to perform migration of data between Plone 4 and Plone 5 (optional)"
But Plone 4 doesn't even support plone.restapi, right? How are we going to accomplish this then?

Probably not easily. There are a couple of ideas:

  1. ensure the CSV import works with some of the various plugins that already do CSV exports
  2. perhaps support the jsonify format?

But personally I think this requirement is less important. You have the option of an in-place database upgrade, which loses no data; a major upgrade is a more technical endeavour, and there are more technical tools like transmogrifier available for that.

Having the following is the most important:

  • robust and useful import/export in both JSON and CSV
  • support for both import/export without loss, and for syncing files and/or parts of the metadata
  • the ability to be used by non-technical people

If it can do that machine-to-machine as well, then that's a bonus.

And of the goals that @djay has laid out, @kakshay21, I would say this one is by far the most important: making it dead simple to use and fairly close to bomb-proof in function, so that a non-technical person has a reasonable expectation that it "just works".

Without knowing anything about 'what it does':

Could the export function in the ZMI, when one chooses XML, be used for anything?

@djay @frapell
What was the problem with importing/exporting large BLOBs?
I mean, was it because of loading/processing the BLOB in memory, or something else?
Any available error traceback would be very helpful.

@frapell @djay @Shriyanshagro
Any idea why these are excluded?

If the member attribute is not exported, then how are users' memberships, or things like portal_membership, maintained in the newly imported site?

No, and I don't understand why the must-include attributes are there either.

Go and have a look at collective.importexport and how it was designed.

It is intended to be used so that you can upload any partial set of metadata you want and match it with any primary key you want. It asks you which field you want to match on.
The current plone.importexport is inferior; you should imagine something more flexible than this.
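
As a rough illustration of that matching idea (the function name and data shapes here are made up for the example, not collective.importexport's actual code):

```python
import csv

def merge_csv(path, existing_items, key_field):
    """Update existing items from a partial CSV export.

    existing_items: dict mapping a primary-key value to the item
    (here just a dict) that should be updated.
    key_field: the CSV column the user chose to match on.
    """
    unmatched = []
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            item = existing_items.get(row[key_field])
            if item is None:
                unmatched.append(row)
                continue
            # Only the columns present in the CSV are touched; any
            # metadata not in the file is left as-is (partial import).
            item.update(row)
    return unmatched
```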

What was the problem with large-sized BLOBs?

At the time of import, the user has the option to choose the attributes of interest. For that case there is a set of attributes (here we call them Must Include Attributes) which are primarily needed to map the imported data.

Similarly, there are a few attributes which are redundant, and we decided to chop them off during export.

Currently, the data is loaded into memory and then goes for further processing; hence the problem with large BLOB files. We had a discussion about overcoming it; I think @dylan has a better say on this.
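
A hedged sketch of the usual workaround: stream the BLOB to disk in fixed-size chunks instead of materialising it in memory first (the URL and credentials are placeholders):

```python
import requests

# Placeholder site URL and credentials for the example.
URL = "https://site.example.com/Plone/big-file/@@download/file"

with requests.get(URL, auth=("user", "secret"), stream=True) as resp:
    resp.raise_for_status()
    with open("big-file.bin", "wb") as out:
        # Only one chunk is ever held in memory at a time.
        for chunk in resp.iter_content(chunk_size=1024 * 1024):
            out.write(chunk)
```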

FYI, @kakshay21, @Shriyanshagro here was the student last year who worked on the first draft of this project. He's a good resource for information about why certain choices were made.

@Shriyanshagro, I"m pretty sure you meant to tag in @djay, who was a mentor last year.

I've drafted my proposal and am looking forward to reviews from mentors and the community.
@loechel @encolpe @djay @espenmn @jean @frapell

Hello, my name is Chinmay Kalegaonkar; I am a junior at SVNIT Surat, India.
I am proficient in web technologies like the MEAN stack.
I am at an intermediate level in Python.
I would like to contribute to this project for GSoC 2018; how do I proceed?

Any ideas on choosing support for XML over JSON, or vice versa?
@Shriyanshagro @loechel @djay @tkimnguyen @frapell
And after going through the weekly reports from last year's GSoC, it seems like JSON had some problem with maintaining the schema in your case?
I've used the Firebase Realtime Database with Vue.js, which only supports JSON, and I didn't have any problem with the schema.
Can you elaborate on those issues, please?
That way I can decide which would be the better choice between XML and JSON before the deadline.

Sorry for hijacking this thread, but I want to add some points:

plone.restapi provides a rich set of methods for dealing with content imports and a well-defined data format for the data to be imported. plone.restapi will be the standard in Plone 5.2, and therefore all functionality and data formats must match plone.restapi without exception. There are a few cases where the functionality of plone.restapi is weak or non-existent, but for the majority of the cited use cases, plone.restapi is the way to go and is to be considered the standard.
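
For reference, plone.restapi's format is plain JSON over HTTP: a GET with an `Accept: application/json` header returns a JSON representation of any content object, and POSTing a similar body to a container creates content there. A minimal sketch (site URLs and credentials are placeholders):

```python
import requests

AUTH = ("admin", "secret")  # placeholder credentials
HEADERS = {"Accept": "application/json"}

# Fetch the JSON representation of an item from a source site.
item = requests.get(
    "http://localhost:8080/Plone/front-page",
    headers=HEADERS,
    auth=AUTH,
).json()

# Recreate it in a folder on a target site; "@type" selects the
# content type, the remaining keys fill the schema fields.
requests.post(
    "http://localhost:8081/Plone/imported",
    headers={**HEADERS, "Content-Type": "application/json"},
    json={"@type": item["@type"], "title": item["title"]},
    auth=AUTH,
)
```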

-aj

Yeah, actually plone.importexport is built on top of plone.restapi.

CSV and JSON, not XML. Go back over the discussion for the GSoC project from last year. This is meant to improve on those same ideas.

@djay I'm just relaying what others have recommended to me. I'm aware of CSV, which is already implemented. If you look at my proposal, you'll see others' opinions. Many have recommended going with CSV and XML, because XML is easier to serialize, instead of going with CSV and JSON. So that's why I was asking about problems with JSON.