Sorting items list: Out of memory

The Debug ID is D95542551.

It looks like an attempt to load and sort the entire database in memory.
  • Sorting items list […] for 372701 items
    Just to clarify, do you intentionally have this many items in one library? (This includes child items.)
  • edited September 16, 2019
    Yes, the books in this group are interconnected, have a common numbering and common tags. There are no subordinate elements, only the registration data of the book, as in the old library card index. This is a normal, real, not a toy situation.

    And what, Zotero can work only with databases that are completely fit in memory?

    Yes, sqlite has cursors for iteration of recordset only in the forward direction, but this does not mean that you should generate representations of elements outside the window.
  • This is a normal, real, not a toy situation.
    That's fine. You just have to understand that you're at the far extreme end of library sizes that people store in Zotero, and you therefore may run into problems that other people don't see.
    And what, Zotero can work only with databases that are completely fit in memory?
    Not entire databases, but certain fields, yes.

    SQLite doesn't really have anything to do with it. In the current implementation, various transformations have to be made on data in the database before it can be displayed in the items list, and the version of SQLite we have access to doesn't have full Unicode collation support, so SQLite sorting doesn't come into play.

    In the future, we may be able to rely more on SQLite sorting, which could help with memory issues, but there are various downsides, including increased database size and likely worse performance for (vastly more common) smaller databases.

    Out of curiosity, how much RAM does this computer have?
  • edited September 17, 2019
    The answer to the last question: 32GB.
    Unfortunately, in Windows your application is 32bit, so the maximum observed memory footprint was about 3.2GB. Page Fault Rate - 4,000-90,000 per second, w/o disk activity.

    I just draw your attention to the bottlenecks of your program, which I really appreciate and try to make it more perfect. For example: next bottleneck is "Duplicate items".
    Please do not consider my remarks as criticism for the sake of criticism.
    I am ready to provide you with all possible assistance in my free time.

    For my personal purposes, I would be glad if you just released the 64 bit version for Windows 10.
  • edited September 17, 2019
    Just FYI:

    Define New Collating Sequences (during database connection)
    http://www.sqlite.org/c3ref/create_collation.html

    Possible UNICODE LIKE, upper(), lower() function solution
    https://www.mail-archive.com/sqlite-users@mailinglists.sqlite.org/msg30403.html
    https://github.com/Zensey/sqlite3_unicode

    Of course, there are also date and numeric values.

    Most common solution: temporary (per session) materialized views for most of fields (and for items list entirely), pre-fomatted, with pre-calculated ordering keys.
    For example: "Creator" field is first candidate to pre-format and store in a temp table.
Sign In or Register to comment.