Public library export nits

1. I might be overlooking something, but I do not see a way to export the entire public library on zotero.org. I see that you can select items (up to 25 at a time) and then choose to export them, but I think there should also be a way to export all items (in the whole library or a chosen collection) in one go.

2. The export files, while served with the correct MIME type, are named "top" and have no file extension. The files should probably be named after the library they are exported from and have an appropriate extension.

3. The Wikipedia export is served as text/html, which is fine, but that also causes the browser to display the Bookmarks page instead of downloading it. I'm guessing that the Content-Disposition header is not being used (which would also explain point #2).

4. The list of available export types looks quite... hackish. Export formats should at least be capitalized properly and perhaps use spaces instead of underscores (e.g. rdf_zotero, though in this case, I think Zotero should be in parentheses).
  • There is no way currently to export an entire library from the website. Exports proceed entirely through the Zotero API, so there is not a good way to handle large libraries which can take many requests to get all the data (and the entire operation could stretch to minutes no matter how we did it).

    We could do a better job with smaller collections/libraries, but there are questions to consider for that.

    Among the questions that occur to me right away:

    1. Right now the export is exactly what is selected (and the button is disabled when nothing is selected). Does the default then become as much of the current library/collection as possible, and inform the user if it will be truncated? This seems potentially more confusing to me.

    2. Do we instead offer the export of the entire collection only if the entire thing falls below our threshold? This seems even more mysterious, and is not easily explained to the user on the page.

    3. Are child items included in the export even if they're not displayed in the current view?


    You're correct about the Content-Disposition, again because it is simply an API response. I'm not sure we would want to add that on responses directly from the API, but with upcoming changes in the way the web library makes API requests we may end up wanting to proxy some of these export requests anyway, in which case we could potentially modify the responses there.


    You're also right about the export labels. They should reflect what is in the client.
  • There is no way currently to export an entire library from the website.
    Btw, this is particularly problematic for public libraries with closed membership (and more than 25 items). Without ability to export from the website, I don't think there is any way (besides exporting 25 items at a time) to obtain the said library even though it is public.
    Exports proceed entirely through the Zotero API, so there is not a good way to handle large libraries which can take many requests to get all the data (and the entire operation could stretch to minutes no matter how we did it).
    Thanks for the insight into how this is done. It is now clearer why large exports are problematic. Having to work with the API's 99-item limit is unfortunate. For flat formats like RIS, BibTeX, etc. you could chain the requests, as you mention, but this does complicate matters a bit for RDF exports (though, I think, it is currently still possible). I think if the limit could be extended to 1000 items this way, it would cover the majority of use cases.
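
To make the chaining idea concrete for flat formats, here is a rough sketch, not a description of how the website actually works. `fetch_page` is a hypothetical stand-in for a single API request, the `start`/`limit` parameter names and the page size of 99 are assumptions for illustration, and `make_fake_library` only exists so the loop can be tried without a network. This kind of concatenation would not be enough for RDF.

```python
from typing import Callable, List


def export_all(fetch_page: Callable[[int, int], List[str]],
               page_size: int = 99) -> str:
    """Chain paged requests and join the flat-format records.

    fetch_page(start, limit) is a hypothetical stand-in for one API
    request; it returns the exported records for that slice.
    """
    records: List[str] = []
    start = 0
    while True:
        page = fetch_page(start, page_size)
        records.extend(page)
        # A short (or empty) page means we have reached the end.
        if len(page) < page_size:
            break
        start += page_size
    return "\n".join(records)


def make_fake_library(n_items: int) -> Callable[[int, int], List[str]]:
    """Build an in-memory fetch_page for trying the loop without a network."""
    items = [f"TY  - JOUR\nID  - {i}\nER  - " for i in range(n_items)]
    calls = []

    def fetch_page(start: int, limit: int) -> List[str]:
        calls.append((start, limit))
        return items[start:start + limit]

    fetch_page.calls = calls  # exposed so the request count can be inspected
    return fetch_page
```

With a 250-item fake library this takes three requests, which is the "many requests" cost mentioned above, and it grows linearly with library size.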

    I was going to suggest bypassing the API, but I think that relying solely on the Zotero API for the website (does it?) is a great way to make sure that the Zotero API is sufficiently featureful. Perhaps Zotero could introduce an additional API function for large exports that would allow exporting whole libraries or entire collections. Being a resource-intensive operation, it could be limited to a handful of requests per hour (or maybe based on the number of exported items).
    1. Right now the export is exactly what is selected (and the button is disabled when nothing is selected). Does the default then become as much of the current library/collection as possible, and inform the user if it will be truncated? This seems potentially more confusing to me.

    2. Do we instead offer the export of the entire collection only if the entire thing falls below our threshold? This seems even more mysterious, and is not easily explained to the user on the page.
    Assuming we're working with a 1000 item limit (I don't think this feature is worth considering with something less than that), the export button would be enabled by default and would export the entire library if in library view or the entire collection when in collection view. If there are more than 1000 items in the whole library/collection, the button would be disabled with a mouse-over message saying that there are too many items to be exported. If items are selected, it would change appearance (perhaps add checkboxes to the button to indicate selection) and would export selected items.
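
The behaviour I have in mind could be sketched roughly like this. The names, the tooltip wording, and the rule that a selection takes precedence over the size check are my own assumptions for illustration, and the 1000-item threshold is just the figure proposed above.

```python
from dataclasses import dataclass

EXPORT_LIMIT = 1000  # assumed threshold from the proposal above


@dataclass
class ExportButton:
    enabled: bool
    scope: str    # "selection", "all", or "none"
    tooltip: str


def export_button_state(total_items: int, selected_items: int,
                        limit: int = EXPORT_LIMIT) -> ExportButton:
    """Decide what the export button does in the current view."""
    # Assumption: an explicit selection always wins, as it does today.
    if selected_items > 0:
        return ExportButton(True, "selection",
                            f"Export {selected_items} selected item(s)")
    # Nothing selected and the view is too large: disable with an explanation.
    if total_items > limit:
        return ExportButton(False, "none",
                            f"Too many items to export (limit {limit})")
    # Nothing selected and the view is small enough: export everything shown.
    return ExportButton(True, "all", "Export all items in this view")
```

So a 750-item collection with nothing selected would export in full, while a 5000-item library would only allow exporting a selection.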
    3. Are child items included in the export even if they're not displayed in the current view?
    I don't think file attachment exports need to be considered for this. As far as notes go, I think yes, they should be exported as well. If we're talking about subcollections, then that becomes a bit trickier. I think collection exports should include only what is visible in the collection on the website (i.e. no subcollection items). If the website decides to implement recursive collections at some point, then the export should reflect that.
    You're correct about the Content-Disposition, again because it is simply an API response. I'm not sure we would want to add that on responses directly from the API
    I'm not sure that would break anything. These headers are probably always ignored when the response is processed.
  • I was going to suggest bypassing the API, but I think that relying solely on the Zotero API for the website (does it?) is a great way to make sure that the Zotero API is sufficiently featureful.
    It does, for that reason.
  • Btw, this is particularly problematic for public libraries with closed membership (and more than 25 items). Without ability to export from the website, I don't think there is any way (besides exporting 25 items at a time) to obtain the said library even though it is public.
    For this specific use case, we've talked before about adding the ability to "subscribe" to a public library in the client without being a member.
    I think if the limit could be extended to 1000 items this way, it would cover the majority of use cases.
    We can't do that. One of the main reasons for moving to API syncing is that there will be no more large requests, which make it very hard to provide predictable performance. (I also don't think having a higher limit is helpful if there's a limit at all—there'd still be some libraries where the limit came into play, with the same problems of handling those or communicating the limitation to users.) I'm open to some sort of asynchronous request where API clients polled for results from a long-running, chained-behind-the-scenes action (which could be throttled appropriately and provide backoff guidance), but it can't be done inline. If RDF requires a single operation behind the scenes, even that might be OK, since the translation-server instance could do the throttling internally. The important part is that the client would have to poll for the request rather than get it inline.
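
Purely as an illustration of the client side of such a polled request, and not a description of anything that exists: `check_status` is a hypothetical stand-in for one status request, and the `done`, `retry_after`, and `data` fields are assumed names. `make_fake_job` is an in-memory job so the loop can be run without a server.

```python
import time
from typing import Callable, Dict, Optional


def poll_export(check_status: Callable[[], Dict],
                sleep: Callable[[float], None] = time.sleep,
                default_wait: float = 5.0,
                max_polls: int = 100) -> Optional[str]:
    """Poll a long-running export until it finishes.

    check_status() is a hypothetical stand-in for one status request. It
    is assumed to return {"done": bool, "retry_after": seconds, "data": str}.
    """
    for _ in range(max_polls):
        status = check_status()
        if status.get("done"):
            return status.get("data")
        # Honor the server's backoff guidance when it provides any.
        sleep(status.get("retry_after", default_wait))
    return None  # gave up; the job never completed within max_polls


def make_fake_job(polls_needed: int, data: str) -> Callable[[], Dict]:
    """In-memory job that reports 'done' on the given poll."""
    state = {"polls": 0}

    def check_status() -> Dict:
        state["polls"] += 1
        if state["polls"] >= polls_needed:
            return {"done": True, "data": data}
        # Ask the client to wait longer each time (simple doubling backoff).
        return {"done": False, "retry_after": 2 ** state["polls"]}

    return check_status
```

The point of the shape is that the client never holds a connection open for the whole export; the server decides how often it is willing to be asked.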
    I'm not sure that would break anything. These headers are probably always ignored when the response is processed.
    I think it's more that it seems a bit weird for an API to include a Content-Disposition header with a filename for random API requests. But I guess this could be limited to the export formats, where there generally is an appropriate extension. The filename is a bit stranger, though—should it always be named after the library, even if it might be for a specific, not really nameable (e.g., because it's a search) subset of the library?
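
One way the "export formats only" idea could look, as a sketch under assumptions: the extension table is illustrative rather than the real list of formats, and falling back to a generic name like "export" when the subset has no obvious name is just one possible answer to the naming question.

```python
import re
from typing import Optional

# Illustrative mapping only; the real set of formats and extensions may differ.
EXTENSIONS = {"bibtex": "bib", "ris": "ris", "rdf_zotero": "rdf"}


def content_disposition(export_format: str,
                        library_name: Optional[str] = None) -> Optional[str]:
    """Return a header value for export formats, or None for other responses."""
    ext = EXTENSIONS.get(export_format)
    if ext is None:
        return None  # not an export format: leave the response untouched
    # Replace anything outside a conservative character set, then fall back
    # to a generic name when nothing usable is left (e.g. for a search).
    base = re.sub(r"[^A-Za-z0-9._-]+", "_", library_name or "").strip("_")
    return f'attachment; filename="{base or "export"}.{ext}"'
```

For example, `content_disposition("bibtex", "My Library")` gives `attachment; filename="My_Library.bib"`, while a non-export format gets no header at all.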
  • edited November 29, 2013
    Sorry, I don't understand the discussion about the API, etc.

    I have a bibliography I want to make available online to anyone.
    It currently has 750 listings and will not grow much more. (40 pages single-spaced, with one blank line between each listing.)

    a) I want to be able to export it to a document on my local PC, and also put it on a key drive I can keep with me in my briefcase.
    Attachments don't matter, just the listings in bibliography format.

    b) my students and other public users can download it the same way.

    c) it can be saved and printed in document format - especially for those less comfortable with online resources, and even computers.

    It must be possible to generate a CSV file from the listings, and from that a text file. I understand that much because I've done it with other databases.

    It's really restrictive making it available only online.
  • Given that you just created your account, I'm assuming you don't already have all the bibliographic information in Zotero? Your request a) suggests that you may be misunderstanding how Zotero works, since those are basic functionalities included in the Zotero software.

    As per the discussion above, Zotero currently doesn't offer a good way to allow users to download the entire bibliographic data or print a bibliography from libraries with more than 25 items from the webpage.
    Possibilities of improving that are discussed above.

    If you do make your library public with open membership, everyone can join, sync and have the entire library in her/his Zotero client, which makes it possible to do all the things you list.

    (There will never be a CSV export option, but that's probably just a clunky workaround that you had to use on other pages - CSV is a terrible format for bibliographic data)
  • I'm shocked at the attitudes shown here. I paid to have a researcher input a large library, and only now do I discover that Zotero does not understand the concept of data liberation? That is truly disappointing.
  • Not sure which part you're referring to. The attitude overall is that these things should be possible. The web interface is _not_ the main means of interfacing with Zotero libraries, so it is not a top priority. None of the issues described above apply to local clients, so the data is not really locked in. As I said above, the only case where this becomes problematic is public libraries with closed membership.

    Feel free to start a thread describing the limitations you are experiencing and we will try to give you our best guidance.