Import .csv in the format generated by export .csv

Importing .csv is a long-standing topic. True, CSV is not a standard bibliographic format. But why can't Zotero import a .csv that was previously generated by its own export function?

I have a load of data in spreadsheets that I would like to get into Zotero. The idea is that I export 1-2 items to CSV; then I can use that .csv as a template, fill it with my data (this is mostly a straightforward task in Excel, where I can use various spreadsheet text formulas to massage my source data), and import it back into Zotero. This also applies to "Notes".

N.B. Part of the data I am talking about is not bibliographic. Basically, Zotero has awesome functionality in terms of both data structuring and retrieval. There are probably other programs with similarly rich data structuring capabilities, but I could not find anything comparable in terms of data retrieval, which is equally important. There are also some aspects that could be improved, but I have figured out workarounds for those. For this reason I would like to use Zotero for managing not only my bibliographic data, but also my "vendors" database, as well as some other collections. It would be great if Zotero had more flexibility for dealing with custom data sets, but even in its current state I can use it. Being able to import a .csv based on the Zotero-generated template would still be really helpful.
  • There may be other reasons, but a significant reason is that it would be incredibly fragile and fragile in possibly dangerous ways -- e.g. it could systematically import some of the data wrong without a clear indication. It would also with a pretty high likelihood entail a massive amount of incredibly tedious troubleshooting. I very much doubt it's going to happen.

    If someone wants to code a custom import translator, it could be distributed quite separately by them, either as a .js file (that can simply be added to Zotero) or even as an extension. That would make very clear that it's not part of Zotero and that Zotero wouldn't be responsible for any problems with it or caused by it.
  • One other reason I can think of is that Zotero items aren't cleanly representable in a simple table: multiple creators, multiple tags, and multiple relations are all joined into a single cell in a way that makes sense to humans but can't be unambiguously parsed back.
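To illustrate the ambiguity described in the reply above, here is a minimal Python sketch. The "; " separator and the sample names are assumptions made for illustration; they are not claimed to match what Zotero's CSV export actually writes.

```python
# Hypothetical flattening: several creators are joined into one CSV cell.
# SEP and the sample names are illustrative assumptions, not Zotero's format.
SEP = "; "


def flatten_creators(creators):
    """Join a list of creator names into a single cell."""
    return SEP.join(creators)


def split_creators(cell):
    """Naive inverse: split the cell on the same separator."""
    return cell.split(SEP)


two_people = ["Smith, John", "Doe, Jane"]
one_institution = ["World Health Organization; Regional Office for Europe"]

# Ordinary personal names survive the round trip.
assert split_creators(flatten_creators(two_people)) == two_people

# A single creator whose name contains the separator comes back as two,
# and nothing in the cell itself signals that anything went wrong.
restored = split_creators(flatten_creators(one_institution))
print(len(one_institution), "creator in,", len(restored), "creators out")
```

Escaping the separator would fix this particular case, but every multi-valued field would need the same treatment, and a hand-edited spreadsheet would have to follow the escaping rules exactly.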
  • I exported data from Zotero to CSV and got an "exported file". Then I tried to import the same file into Zotero. This is rather strange.
  • @takan.bhatt: It's not strange at all — as explained above, CSV isn't a real bibliographic format. The export is just a way to get data out in a format that can be used in Excel or similar. If you want to transfer some items between Zotero installations via export, use Zotero RDF.

    If you're trying to transfer your library to another computer, don't use export at all; see Transferring a Library.
  • But there should be a facility for this. When we are talking about data science, "big data" and analytics, there is no tool available for bibliometric analysis of local data (not even in R). When we analyze citations, we observe that researchers make many mistakes in them. We have to correct that data and make it clean. After cleaning the data in Excel, it needs to be converted back into BIB / RIS format for VOSviewer or a similar app.
  • You should not clean bibliographic data in Excel. Either clean it in Zotero itself or use the Zotero API (e.g., you can access the Zotero API in R using https://github.com/giocomai/zoteror).
  • Mr bwiernik, I agree with you. However, I suggest improving record editing in Zotero. First of all, there should be a sorting and filtering facility so that handling big data would be easier. I am not a coding guy, but I hope my suggestion might be acceptable.
  • @dstillman: It's quite strange and misleading, I would say. I think it's natural to export your data so you can later import it back, hence it's weird to include an export format that can't be imported. I exported to CSV before a Windows reinstall, and now I can't load it back in and am stuck with an empty library.

    At least you could add a warning message like "this data can't be imported, use a different format or use Zotero Sync for library transfer", that would have been really helpful.

    Even the wording is misleading here, as the button says "Export Library...", and above it was said that CSV and other formats can't reliably store library data.
  • Then why not use the API (either the public API or the local client API)? That gets you JSON objects easily digested by analysis tools.

    If you need to have a human in the loop (maybe to make manual selections), RDF provides solid round-trippable export. If you want something that is more easily digestible in code and still roundtrips, the Better BibTeX plugin offers JSON export as a side effect, with the explicit intent of being roundtrippable (more so than RDF technically, but that's mostly on aspects few people will care about, such as things in the extra field).

    This JSON dump is mostly the format you would get from the public/local API, with some decorations I need for diagnostics. It has been stable for a long time now, but it is technically an internal feature of BBT that is intended for my debugging, so I cannot offer a 100% guarantee that it won't change.
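As a rough sketch of the API route suggested above, the following Python builds a request against the public Zotero Web API and pulls the nested creators and tags out of an item. The user ID, the API key, and the hand-written `sample` item are placeholders; the item shape shown is an assumption based on the API's JSON item format and should be checked against real responses.

```python
import json
import urllib.request

API_ROOT = "https://api.zotero.org"


def build_request(user_id, api_key, limit=25):
    """Build (but do not send) a request for items in a user library."""
    url = f"{API_ROOT}/users/{user_id}/items?format=json&limit={limit}"
    headers = {"Zotero-API-Version": "3", "Zotero-API-Key": api_key}
    return urllib.request.Request(url, headers=headers)


def fetch_items(user_id, api_key, limit=25):
    """Send the request and decode the JSON array of items (needs network)."""
    with urllib.request.urlopen(build_request(user_id, api_key, limit)) as resp:
        return json.load(resp)


def summarize(item):
    """Keep the nested parts as lists instead of squeezing them into one cell."""
    data = item["data"]
    names = []
    for creator in data.get("creators", []):
        if "name" in creator:  # single-field creator, e.g. an institution
            names.append(creator["name"])
        else:
            names.append(f'{creator.get("lastName", "")}, {creator.get("firstName", "")}')
    tags = [t["tag"] for t in data.get("tags", [])]
    return {"key": item["key"], "title": data.get("title", ""),
            "creators": names, "tags": tags}


# Hand-written placeholder item, not real library data.
sample = {
    "key": "ABCD2345",
    "data": {
        "itemType": "book",
        "title": "An Example Title",
        "creators": [
            {"creatorType": "author", "firstName": "Jane", "lastName": "Doe"},
            {"creatorType": "author", "name": "Example Institute"},
        ],
        "tags": [{"tag": "vendors"}, {"tag": "to-read"}],
    },
}
print(summarize(sample))
```

Calling `fetch_items("<userID>", "<apiKey>")` with real credentials would return a list of such items, each of which can be passed to `summarize`.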
  • "Even the wording is misleading here, as the button says 'Export Library...'"
    In Excel you can export your workbook to CSV and you will lose all formatting and all sheets except the first. You can export Word documents to PDF and lose all editing facilities. These don't roundtrip. Both are still export.

    As to your request, CSV is just a poor format for nested data; if you wanted something that can safely roundtrip you'd end up with monstrosities like ragged-length rows or JSON in cells, which would be absolutely useless for anything but re-import into Zotero. And for that, there already is RDF.

    In general, CSV simply cannot sensibly host all the richness of the source data, which is a problem with the choice of target format, not a problem with the exporter to it. Export is a matter of compromises. All Zotero exporters have compromises, even RDF. Export is not backup. If you want the unadulterated data, use the API.

    If you can guarantee very rigid limits on what you will allow to be in your Zotero items, it would be possible to have round-trippable CSV with a trivial importer that is built on the assumptions that these particular limits are in place, which you could drop into your translators directory, and it would just work.
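The idea of a round-trippable CSV under rigid limits can be sketched in Python as follows. This is not a Zotero translator (those are written in JavaScript), and the field list, the "; " separator, and the specific limits are assumptions chosen for illustration: only string values, a fixed set of fields, and no semicolons or empty entries inside creators or tags.

```python
import csv
import io

SEP = "; "
SCALAR_FIELDS = ["itemType", "title", "date"]  # hypothetical fixed schema
LIST_FIELDS = ["creators", "tags"]


def check_limits(item):
    """Reject any item that breaks the assumptions the importer relies on."""
    if set(item) != set(SCALAR_FIELDS + LIST_FIELDS):
        raise ValueError("item has unexpected or missing fields")
    for field in SCALAR_FIELDS:
        if not isinstance(item[field], str):
            raise ValueError(f"{field} must be a string")
    for field in LIST_FIELDS:
        for value in item[field]:
            if not isinstance(value, str) or value == "" or ";" in value:
                raise ValueError(f"{field} entry breaks the limits: {value!r}")


def to_csv(items):
    """Write items to CSV text, joining list fields into single cells."""
    buf = io.StringIO(newline="")
    writer = csv.DictWriter(buf, fieldnames=SCALAR_FIELDS + LIST_FIELDS)
    writer.writeheader()
    for item in items:
        check_limits(item)
        row = {f: item[f] for f in SCALAR_FIELDS}
        row.update({f: SEP.join(item[f]) for f in LIST_FIELDS})
        writer.writerow(row)
    return buf.getvalue()


def from_csv(text):
    """Trivial importer: only correct because check_limits held on export."""
    items = []
    for row in csv.DictReader(io.StringIO(text, newline="")):
        item = {f: row[f] for f in SCALAR_FIELDS}
        for f in LIST_FIELDS:
            item[f] = row[f].split(SEP) if row[f] else []
        items.append(item)
    return items


library = [
    {"itemType": "book", "title": 'Data, "big" and small', "date": "2019",
     "creators": ["Doe, Jane", "Smith, John"], "tags": ["vendors"]},
    {"itemType": "report", "title": "No tags here", "date": "",
     "creators": ["Example Institute"], "tags": []},
]
assert from_csv(to_csv(library)) == library
```

The design choice is to refuse at export time anything that the importer could not read back, rather than trying to make the importer clever. A spreadsheet edited by hand would have to respect the same limits for the round trip to stay safe.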
  • @emilianoeheyns
    I see your point that exporting does not mean a 100% data transfer; you're right.

    I think it ultimately boils down to a UX question. A generic Zotero user, like myself and probably the OP, may not be familiar with what data losses might occur with different export formats. For software like Zotero, whose main service is storing critical data in a structured way, exporting/transferring data seems like a key aspect which, in my opinion, should be better presented to the user.
  • Word does not warn you if you export your document to PDF. And the data is not lost. It is just not in a particular export, and that's pretty hard to miss. I don't see Zotero prompting "this is not a backup" for all export formats.

    But as I said, if you want it to roundtrip, and you control the way it is being used, roundtrip CSV is entirely possible. Just not out of the box.