Unexpected attached files [was: Attached files replicated, substituted, renamed. Possible sync bug]
I have a situation where the attached files for an item are corrupt. First, if you look at this screen shot http://www.b3sz.com/files/zotero-screen1.png you'll see a few things:
1) There should only be one attached file but Zotero shows 33.
2) The name of the file attachment is "ACM Full Text PDF" which doesn't match the file name. (I only recently found you could rename a file - more on this later.)
3) A search of zotero storage does find 33 instances of a file with this name, each in a different storage directory. (Like storage\4FVMI6MM\Landry - 2009 - Analyzing the London ambulance service's computer.pdf, storage\625HQ7DR\Landry - 2009 - Analyzing the London ambulance service's computer .pdf)
4) The contents of these files are all (mostly) different. Two seem to be the right file. I don't recognize the others.
5) I just searched for files named "ACM Full Text PDF" which should be common because that is the default unless I rename it. There aren't any.
6) A Zotero database check finds no problems.
7) Version is 2.0.9. Running in Firefox 3.6.13. Windows XP SP3.
8) I'm syncing to my own WebDAV server (Apache on Fedora)
One item for consideration: I do most of my work on a desktop but also sync to my laptop. I mark up the PDFs as I read them (highlight, etc.). I know this gets the file contents out of sync and I don't think Zotero is intended to handle this. Just the other day I had the idea that I could force uploading the file by renaming it. I think it was this file (Landry - 2009...). I clicked on the title, deleted a space, checked Rename associated file, and saved. Sure enough, the file synced back to my desktop fine. This may have nothing to do with anything, but it smelled close enough to mention. That was 2 days ago.
At least one other item had this same problem, but only 4 replications of the PDF. The file name is different, but the file title in zotero is the same ("ACM Full Text PDF").
I backed up the zotero database. I have access to the WebDAV logs. I don't know the last time the laptop was sync'ed, but not today, so there is an older copy of the database on it.
I'd like to know: a) How can I find all affected items? So far I have only found the 2. b) How do I fix my library? c) How do I prevent it from happening again?
Thanks in advance.
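For question a), one way to look for other affected items is to scan the storage folder for PDF names that show up in more than one storage\<KEY> directory. The following is only a minimal sketch: it assumes the storage\<KEY>\<file>.pdf layout described in point 3, treats names that differ only by case or whitespace as the same name, and hashes each file so that copies with different contents (point 4) stand out. The function names are made up for this illustration and are not part of Zotero.

```python
import hashlib
import re
from collections import defaultdict
from pathlib import Path


def normalize(name):
    """Lower-case a file name and drop all whitespace so near-identical names compare equal."""
    return re.sub(r"\s+", "", name.lower())


def find_replicated_pdfs(storage_dir, min_copies=2):
    """Group PDFs found in storage/<KEY>/ by normalized name.

    Returns {normalized_name: [(storage_key, sha1_of_contents), ...]} for every
    name that appears in at least `min_copies` storage directories.
    """
    groups = defaultdict(list)
    for pdf in sorted(Path(storage_dir).glob("*/*.pdf")):
        digest = hashlib.sha1(pdf.read_bytes()).hexdigest()
        groups[normalize(pdf.name)].append((pdf.parent.name, digest))
    return {name: copies for name, copies in groups.items() if len(copies) >= min_copies}
```

Calling find_replicated_pdfs() on the Zotero storage directory and printing the result lists each repeated name with the storage keys that hold it. If the hashes within a group differ, the copies are different documents that merely share a name, which is what was observed here.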
, but a display bug. The label in the central panel of Zotero is not the same as the file name and, for some reason, the label on the right, which _should_ be the filename, isn't either. If you click on "show file" you'll see that the actual file name corresponds to the article - that's done automatically on import. So the main question to me is how those 33 file attachments appeared. I don't think your re-naming could have caused this, but I'm not sure.
edit: deleted wrong stuff - attachment name vs. file name works as intended
For the 33 items, nobody has ever reported this before. But all Zotero items have date added, for example, so there shouldn't be any mystery as to when they were created. Can you be more specific?
After investigating the other files in detail I've found the cause of the "problem". Now I don't know if it is a bug or just unexpected behavior.
All of the PDFs are from the same conference (SIGMIS-CPR'09). I just reproduced the issue by going to the ACM entry for the Landry paper (doi 10.1145/1542130.1542163) and clicking on the icon in the URL bar to download to zotero. Sure enough, it downloaded all 33 papers in the conference into one conference journal entry for the Landry paper. I didn't know it was possible for zotero to download multiple files for an item this way.
No sync errors, no corruption, no file renaming issues.
If this DOI were an entry for the entire conference proceeding, then I could understand this behavior. But it seems to be a bug to stuff 33 separate conference papers into one journal entry for only one of those papers. So this is a site translator bug?
Are you perhaps seeing a different URL? Or the page content is different? The translator would save multiple PDFs if there were multiple matches for the XPath expression //a[@name="FullTextPdf" or @name="FullTextHtml" or @name="FullText Html"], so if you are seeing multiple such A elements in the page source, this would make sense. But I don't see that on the ACM page, and I don't get the behavior you describe.
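To check whether a saved copy of the page source contains several such A elements, something like the following could be used. It is a rough sketch, not the translator's own code: instead of evaluating the XPath expression it walks the HTML with Python's standard html.parser and collects every a tag whose name attribute is one of the three values in the expression. The class and function names are hypothetical.

```python
from html.parser import HTMLParser

# The three name values tested by the XPath expression quoted above.
FULL_TEXT_NAMES = {"FullTextPdf", "FullTextHtml", "FullText Html"}


class FullTextLinkCollector(HTMLParser):
    """Collects the href of every <a> whose name attribute is a full-text marker."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        if attr_map.get("name") in FULL_TEXT_NAMES:
            self.links.append(attr_map.get("href"))


def full_text_links(page_source):
    """Return the hrefs of all matching anchors in the given HTML source."""
    collector = FullTextLinkCollector()
    collector.feed(page_source)
    collector.close()
    return collector.links
```

If full_text_links() returns more than one entry for a single paper's page, the page would be expected to yield more than one attachment.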
ACM supports two views, tabbed and single page. In the single page view, the table of contents for the publication and links to all of the PDFs are included in the main page.
For example (my raw, proxied URLs):
Tabbed view: http://portal.acm.org.proxy-bc.researchport.umd.edu/citation.cfm?id=42375&preflayout=tabs
Single page view: http://portal.acm.org.proxy-bc.researchport.umd.edu/citation.cfm?id=42375&preflayout=flat
Your last view is sticky, so now this will give you a single page view:
http://portal.acm.org.proxy-bc.researchport.umd.edu/citation.cfm?id=42375
So I must have clicked on the single page view some time ago and it stuck and I didn't notice the difference.
Interestingly, if you choose the tabbed view and click on the table of contents, and then save to Zotero, it prompts for which entry to save. This prompt doesn't happen in the single page view.
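Since the layout is carried in the preflayout query parameter of the URLs above, a small helper can show which view a URL asks for and rewrite it to request the tabbed view explicitly. This is only a sketch: the parameter name and its tabs/flat values are taken from the example URLs, the function names are invented for illustration, and I am assuming the site honors the parameter whenever it is present.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit


def layout_of(url):
    """Return the preflayout value in the URL, or None if the URL does not set one."""
    query = parse_qs(urlsplit(url).query)
    return query.get("preflayout", [None])[0]


def with_layout(url, layout="tabs"):
    """Return the same URL with preflayout set to the given layout."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    query["preflayout"] = [layout]
    return urlunsplit(parts._replace(query=urlencode(query, doseq=True)))
```

A URL for which layout_of() returns None falls back to whatever view was used last, which is the sticky behavior described above.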
It should start working again. If this works for you, please post here so that I can submit this change to be pushed to all users.
Thanks.
My laptop also has "ACM Digital Library.js" that contains "lastUpdated":"2011-02-24 23:30:00". It does not have "ACM.js" either.
The laptop is still on Zotero 2.0.9 and Firefox 3.6.13.
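To compare translator versions between the two machines without opening each file by hand, the lastUpdated value can be read with a few lines of script. This is a sketch under one assumption: that the translator file begins with a JSON metadata object (the part that contains "lastUpdated") followed by the JavaScript code. The function name is made up for this example.

```python
import json


def read_translator_metadata(path):
    """Parse the JSON object at the top of a translator .js file and return it as a dict."""
    # utf-8-sig tolerates a byte-order mark at the start of the file.
    with open(path, encoding="utf-8-sig") as handle:
        text = handle.read()
    # raw_decode parses one JSON value and ignores the JavaScript that follows it.
    metadata, _end = json.JSONDecoder().raw_decode(text.lstrip())
    return metadata
```

Running read_translator_metadata() on "ACM Digital Library.js" on each machine and comparing the "lastUpdated" entries shows whether both have the same translator version.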