Content Import and Export

You can't determine whether a plugin supports Dexterity or Archetypes except by reading its documentation or its code.

Don't we need to resolve the version conflict issue as part of this project?

I would think that a final project should not have version conflict problems, so yes.

Can you please elaborate on how the final project would not have any version conflicts? @djay suggested something different here :confused:

@djay, as part of this project:
For exporting: we are only providing content in JSON and CSV format, and not as a directory- or file-like structure, right?

While importing: what data types could be uploaded? Aren't we only uploading JSON and CSV files, usually exported from other sites?

@djay, @ebrehault, @Shriyanshagro has submitted a draft proposal for this project idea. Would you please take a few minutes to review and offer suggestions for improvement? Thanks very much.

I think we may be having confusion over what exactly we mean by 'version conflict'. I was referring to the difficulty you had above in installing the package due to incompatible version pins for collective.z3cform.datagridfield. I think @djay is referring to the ability to move content between different versions of Plone. The trouble is arising because the word 'version' is being used to refer to two different things here.

Can you explain which of the two your question about version conflicts refers to? Or has my pointing out the difference helped you to understand what I meant when I said the final version of your project should be able to be installed without having to edit the versions.cfg file of a user's buildout?

Oh, I see, it was all a misunderstanding.
Anyway, I now have no concerns about versions.cfg and its conflict issue.

But I'm concerned about:

  • Don't we need to resolve the version conflict issue as part of this project?
  • This link states that importexport already handles this for Plone (3.6+). Does collective.importexport provide this functionality?

I'm still not sure what version conflict you mean. Are you talking about what happens if you export from one site which has a certain schema for your custom dexterity types, into another site where the schema is different?
Part of this project is to have a UI that helps handle this, where a dry-run import warns you that some of your data doesn't match. Also, the existing collective.importexport allows mapping fields in your data to fields in your schemas. This might be extendable to the JSON format too, with some clever thinking.
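
To make the dry-run idea concrete, here is a minimal sketch in plain Python. It is not the collective.importexport code: the function name `dry_run`, the `schema_fields` dictionary (type name to set of field names) and the `field_map` argument (source field to target field) are all hypothetical names made up for illustration.

```python
def dry_run(records, schema_fields, field_map=None):
    """Check records against the target schemas without writing anything.

    All names here are hypothetical, for illustration only:
    records       -- list of dicts, each with a "portal_type" key
    schema_fields -- dict mapping a type name to the set of its field names
    field_map     -- optional dict mapping source field names to target names
    Returns a list of (record_index, message) warnings.
    """
    field_map = field_map or {}
    warnings = []
    for index, record in enumerate(records):
        portal_type = record.get("portal_type")
        if portal_type not in schema_fields:
            # The destination site has no such content type at all.
            warnings.append((index, f"unknown type: {portal_type}"))
            continue
        known = schema_fields[portal_type]
        for name in record:
            if name == "portal_type":
                continue
            # Apply the user's mapping first, then look the field up.
            target = field_map.get(name, name)
            if target not in known:
                warnings.append(
                    (index, f"field '{name}' has no match in {portal_type}")
                )
    return warnings
```

With a schema of `{"Document": {"title", "text"}}`, a record carrying a `body` field would produce a warning until the user maps `body` to `text`, which is roughly what the UI would let them do before running the real import.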

Yes, there are many existing ways to do export currently. None of them are end-user friendly. The goal here is an end-user-friendly way that is generically useful, so that it can hopefully be included in Plone core one day.

The idea is to support either of two formats: JSON + files/folders, or CSV + files/folders, as the user chooses. If it's just about updating metadata, then perhaps you would only need the CSV or JSON.

Export would have to be via a zip.

Import could be via a zip, or there are HTML5 ways to upload a directory structure without a zip on some browsers, which might be nice to support.
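
As a rough sketch of the "JSON + files/folders in a zip" layout, assuming a single `content.json` at the top of the archive and binaries under a `files/` folder. Those two names, the `file` key and the functions `export_zip` / `import_zip` are my own assumptions for illustration, not an agreed format:

```python
import io
import json
import zipfile


def export_zip(items):
    """Pack items into an in-memory zip: content.json plus files/ entries.

    Each item is a dict with a "path", arbitrary metadata, and optionally
    a "data" key holding the binary payload (hypothetical structure).
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        metadata = []
        for item in items:
            # Metadata goes into JSON; binary data goes into its own file.
            entry = {key: value for key, value in item.items() if key != "data"}
            if item.get("data") is not None:
                file_path = "files/" + item["path"].lstrip("/")
                archive.writestr(file_path, item["data"])
                entry["file"] = file_path
            metadata.append(entry)
        archive.writestr("content.json", json.dumps(metadata, indent=2))
    return buffer.getvalue()


def import_zip(payload):
    """Read the archive back, re-attaching binaries via the "file" key."""
    with zipfile.ZipFile(io.BytesIO(payload)) as archive:
        metadata = json.loads(archive.read("content.json"))
        for entry in metadata:
            if "file" in entry:
                entry["data"] = archive.read(entry["file"])
    return metadata
```

Keeping the binaries out of the JSON means the metadata file stays small and readable, and the same folder layout could be reused for the CSV variant.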

What if, in the other site, not just the schema is different but also its data types, i.e. Archetypes (in Plone 4)?

Yes, I have already included this error handling.

What about permissions for each object's data? Does the existing collective.importexport handle it?

I still don't understand this part :confused: What would the files/folders contain? We could still provide the data in JSON alone, as every other import/export API written in Python does.

Well, I'm looking forward to achieving this goal. :slight_smile:

Schema more or less means data types. If one kind of type doesn't exist when importing, then yes, it's something the user needs to be warned about.

I can't remember exactly, but I think not. I think it might check the right to create a new object, but perhaps not to edit. It currently doesn't handle things like workflow state, which have their own permissions.

Hmm, good question. Since JSON is there to support developers, I guess it doesn't matter if blob data is embedded. However, the CSV format is there to support non-developers, and they might like to just upload a bunch of files in directories along with a CSV containing metadata and a path field pointing to the files. That is what we would like to support. But ideally they could also use a single JSON file with a path field, plus a bunch of files in folders.
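
A small sketch of the "CSV with a path field pointing to the files" case, using only the standard library. The function name `read_csv_with_files`, the default column name `path` and the `data` key are assumptions for illustration; paths are resolved relative to the folder containing the CSV:

```python
import csv
from pathlib import Path


def read_csv_with_files(csv_path, path_field="path"):
    """Read metadata rows and attach the file each row points to.

    Returns (rows, missing): rows is a list of dicts, with a "data" key
    added when the referenced file exists; missing lists the relative
    paths that could not be found, so the user can be warned.
    """
    base = Path(csv_path).parent
    rows, missing = [], []
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            relative = row.get(path_field, "")
            if relative:
                target = base / relative
                if target.is_file():
                    row["data"] = target.read_bytes()
                else:
                    missing.append(relative)
            rows.append(row)
    return rows, missing
```

Rows with an empty path are treated as metadata-only, which would cover the "just updating metadata" case. A real importer would also need to reject paths that escape the upload folder, which this sketch does not do.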

Oh, great. So now version conflicts just come under the error handling feature, and nothing more than that?
Initially I was inferring that, as part of the project, we had to support all Plone versions, but actually it's exclusively for Plone 5. Please correct me if I'm wrong here. :slight_smile:

And is it the same for export too? I mean, do we give webmasters a choice of structure, or is it mandatory that every time content is exported, files are separated into a folder with a path field pointing to them? Or do we sometimes have to provide embedded data too?

I have looked through -
https://github.com/plone/Products.CMFPlone/issues/1386 and,
https://github.com/plone/Products.CMFPlone/issues/1373 .

And now I understand the actual length and breadth of this add-on and its use cases.

I have no more confusion about data formats, permission access, version conflicts or the Web UI. :slight_smile:

Questions arising from the proposal review:

  • Do we expect a non-Manager to be importing anything into a site? This could simplify the permission requirements. Refer to this link for the complete issue

  • Is there any requirement for a user manual for this add-on? Refer to this link for the complete issue

  • Could webmasters create custom content types? Refer to this link for the complete issue

  • Do we really need the CSV format? Refer to this link for the complete issue

  • Which development model would be best for this project? An Agile process? Refer to this link for the complete issue

  • Do I also need to write integration tests and system tests? Or are they already written?

Yes. A user with enough permissions on some folder should be able to use the import tools to import any content the user normally has permission to add.

Yes. A pull request to the Plone documentation repository would be preferred.

Yes. That's a built-in feature of Plone 5 and beyond.

Yes. The reasoning was that JSON is often too technical for non-technical users. But CSV support could have limitations (it would only cover data that makes sense to include in CSV). For content types with binary attachments, CSV support should allow uploading those binary files along with the CSV (e.g. in the same ZIP).

Yes. Making a release after each usable feature would reduce risks. Leaving packaging and releasing until the end of the project would be risky. Without a working release, there is a risk of the work being wasted. Making a release even with only part of the original goals implemented would allow the community to use it, and the incomplete features could be finished later.

Automated tests would be preferred. Those tests would probably mostly be integration and functional / acceptance tests.

The discussion here sort of invalidates my putting comments in the Gdoc :slight_smile:

I disagree with CSV support... the scope of this project seems big enough to me without it, and there are so many gotchas with CSV formatting. If this tool generates the JSON files and then consumes them, then there is no user-visible aspect to the file format supported. If someone is finicky enough to want to inspect or modify the JSON file then I presume the person is able to handle the more challenging nature of that format.

I also think about my scenarios at large institutions: I would never trust a non-Manager to be importing content from another site, and again it simplifies the project if it doesn't have to handle non-Manager roles and permission levels.

My question in the Gdoc about trying to create missing content types in the destination site is from knowing that some CTs are implemented in Python, not as TTW XML, so I don't see how this import/export tool could recreate the Python implementations of CTs.

...and this supports my other point about assuming Manager role... do you want non-Managers creating content types in a site?

I recall CSV support being a quite high priority in the original idea.

I agree that if non-Manager means more work, it's not a priority. But if your editors can already add new content, how would importing be any different from that?

The capability of importing a large amount of new content might be considered different from manually adding new content... but yeah my point was that non-Manager role implies more testing for cases like the person being able to add new content but not able to create content types, or being able to add Events but not News Items, etc.