"New Item" → "Web Page" option for citing offline webpages

  • I just wanted to chime in that I have to manually add websites frequently (when I say frequently, I have manually created hundreds if not more than a thousand webpage entries), so I'd appreciate if this was available in the dropdown. It's not the biggest deal, but having to click a few extra times to change the manually created item to a web page is a pain when I'm doing it so frequently.

    For context, I am collecting utility data, so when the tariffs are updated annually the old webpages are intentionally taken offline (and hence the Zotero Connector will not work).

    In these cases, I believe citation best practices indicate that I should put the broken URL and the date that I accessed the URL (when it was working). At the moment I believe this has to be done manually, and I would appreciate support for this use case.

    I just want to add in case there are concerns about having a record of broken URLs: I have manually archived the data, and it is available on the Wayback Machine, but I still believe best practice is to cite the broken URL and not the Wayback Machine (as the broken URL was the true source of the information).
  • @fletchapin:
    "I believe citation best practices indicate that I should put the broken URL and the date that I accessed the URL (when it was working)"

    FWIW, there's definitely no consensus that that's a best practice: various style guides (e.g., MLA, CMOS (discussed in a previous thread)) say to use the Wayback Machine URL when citing offline pages.

    I guess you're saying you're recording data from the page while it's online and only later creating Zotero entries (?), which is perhaps a bit of a distinct situation, but regardless, you're already including the bibliographic details that establish attribution. The point of the URL is to let the reader view the information themselves if they want, and a trustworthy archived URL lets them do that.

    We've long said that an offline page is more or less the sole legitimate reason for manually creating webpage items, but if the page is available in the Wayback Machine, I would recommend starting with that.
  • Yes, there are times when I have downloaded the PDF/HTML and forgotten to create the citation at that point; then, when I go back to cite it later, the URL is down. In those situations, I agree that was my mistake, as I should have captured the webpage in Zotero at the time I accessed the data.

    However, in other cases utilities re-use the same URLs, so the webpage is still live but the metadata is now outdated. In those cases, the Zotero connector would give me incorrect information, but the Wayback Machine is also not necessary as the webpage is still live.

    Finally, from what I found online, it appeared that when you do use Wayback Machine, you should cite both the original URL and Wayback Machine URL: https://help.archive.org/help/using-the-wayback-machine/. This also agrees with the MLA style guide: https://style.mla.org/citing-work-wayback-machine/

    In the latter two cases, I believe that it is always necessary to create a manual Web Page entry. For example, if I try to create the Zotero entry automatically from the Wayback Machine, I get the following metadata, which does not include any information about the original URL:

    https://s3.amazonaws.com/zotero.org/images/forums/u8581932/bktwwin9z91ieqp7skp1.png
  • Following up on joncto's comment in the linked thread, if the concern is that folks should not be creating Web Pages manually, would it be possible to add a new resource type called "Web Archive Resource" that is listed in the manual dropdown and has fields for both the original URL and archive URL?

    This would have the dual benefit of standardizing this confusing process of citing archived websites, while also saving time for folks like me who have to manually cite a lot of those resources.
  • "Finally, from what I found online, it appeared that when you do use Wayback Machine, you should cite both the original URL and Wayback Machine URL: https://help.archive.org/help/using-the-wayback-machine/. This also agrees with the MLA style guide: https://style.mla.org/citing-work-wayback-machine/"

    That MLA page (which I also linked to) specifically shows using the Wayback Machine URL, not the original URL.

    Again, there's really no benefit to the reader to include a broken link. It's trivial to extract an original URL from a Wayback Machine URL if you really want to. It's much harder to take an original URL and navigate to a specific access date in the Wayback Machine or some other archive.

    Saving from the Wayback Machine will get you the correct archive URL. Other data will vary — site-specific Zotero translators won't be used, but any embedded metadata will be, so you'll usually at least get Title, Date, and Accessed, and you might get authors or other fields. It's better than typing everything from scratch.
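
As a rough illustration of the point above that the original URL can be recovered from a Wayback Machine URL, here is a minimal Python sketch. It assumes the common `web.archive.org/web/<14-digit timestamp>/<original URL>` layout; the names `parse_wayback_url` and `WaybackCapture` are hypothetical and not part of Zotero or any Internet Archive library.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional

# Assumed layout: web.archive.org/web/, a 14-digit capture timestamp,
# an optional two-letter modifier such as "id_", then the original URL.
WAYBACK_RE = re.compile(
    r"^https?://web\.archive\.org/web/(\d{14})(?:[a-z]{2}_)?/(.+)$"
)


class WaybackCapture(NamedTuple):
    original_url: str
    captured_at: datetime


def parse_wayback_url(url: str) -> Optional[WaybackCapture]:
    """Split a Wayback Machine URL into original URL and capture time.

    Returns None when the URL does not match the assumed layout.
    """
    match = WAYBACK_RE.match(url.strip())
    if match is None:
        return None
    timestamp, original = match.groups()
    return WaybackCapture(original, datetime.strptime(timestamp, "%Y%m%d%H%M%S"))


capture = parse_wayback_url(
    "https://web.archive.org/web/20231019175606/https://howard.edu/academics"
)
print(capture.original_url)        # https://howard.edu/academics
print(capture.captured_at.date())  # 2023-10-19
```

Going the other way (from an original URL to the capture closest to a given access date) needs a lookup against the archive itself, which is why the archived URL is the more useful one to record.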
  • If you look at the Internet Archive help page I shared, you'll see that they got this answer directly from MLA:

    "This question is a newer one. We asked MLA to help us with how to cite an archived URL in correct format. They did say that there is no established format for resources like the Wayback Machine, but it’s best to err on the side of more information. You should cite the webpage as you would normally, and then give the Wayback Machine information. They provided the following example: McDonald, R. C. “Basic Canary Care.” _Robirda Online_. 12 Sept. 2004. 18 Dec. 2006 [http://www.robirda.com/cancare.html]. _Internet Archive_. [http://web.archive.org/web/20041009202820/http://www.robirda.com/cancare.html]."

    This is why on the MLA website it says "Include the archived web page’s information in container 1 of your entry". Therefore, it is necessary to cite both the original URL metadata as container 1, and also the Wayback Machine as container 2.

    As far as I could tell, the current Web Page manual entry does not represent both containers, and therefore I concur with joncto that a new "Web Archive Resource" would be useful to capture the metadata for both container 1 and 2.

    Imagine if the Wayback Machine goes down as well. It is just another website after all. In that case, if I cited just container 2 (the Wayback Machine) as you are recommending, then the original web page's metadata is lost to the world. I would much rather have the meaningful metadata about the original web page (container 1) than the metadata about the Wayback Machine (container 2).

    In conclusion, does Zotero currently support citing two websites, one as container 1 and one as container 2? I don't believe it does, so that's why I seconded joncto's request for a new resource type called "Web Archive Resource".
  • I don't know why you'd think MLA would post an incorrect/incomplete example on their website. All examples they provide are complete. Since IA last asked them, they have (quite reasonably) changed their guidance on citing archived websites to include only a single URL, the archived one.

    The Chicago Manual concurs for the Wayback Machine. 14.104 has:
    7. “Academics,” Howard University, archived October 19, 2023, at https://web.archive.org/web/20231019175606/https://howard.edu/academics.

    Chicago (also reasonably) suggests including both the URL and the archived URL when using a service like perma.cc where the archived URL doesn't include the original one, but that's a) pretty rare and b) can probably be done using "Archive:" in Extra if need be. A dedicated item type is overkill and almost certainly not going to happen.
  • At the very least, I think you'd agree that there is very different guidance on how to cite archived web pages than normal web pages, or else this thread would not exist.

    Given that web archives are distinct from normal web pages in terms of citation style, why do you think a dedicated item type is overkill? There are numerous niche Zotero item types, such as "Instant Message" and "Forum Post".