MetaData Tools
I searched for "metadata" because I have PDFs whose metadata may exist on the internet but not in the document itself. I want to use metadata lookup to add that metadata to my homemade PDFs.
This would be the case when I had to copy the text from a website, paste it into Word, and then create a PDF. The website offered a PDF, but it was full of junk. So maybe that PDF has metadata, and I want to add it to mine.
I have a program that lets me add metadata to a PDF (BeCyPDFMetaEdit). What I want to know is: what data do these metadata servers request? Can I use whatever DOI or ISBN the PDF carries, or do I need the full title, author, etc.? By that point I would simply have entered the metadata myself.
I also encounter a lot of PDFs from non-academic quarters (government, business, and non-profits) that do not even trouble to get a DOI. Who issues these things anyway, and can we request that DOIs be assigned? I'm going to guess it costs money, but these folks must know that important white papers, reports, and position papers need to be registered so they can be cited in papers. I guess we're not going to get the Library of Congress to write a rule this year...
Since there is so much diversity in this environment, we're all just trying to do the best we can. I would find it helpful if there were a widget that would scan the text of a pdf for metadata, or allow me to copy the portion of a web page with the name of the publisher or organization, and drop that copy into the widget to enter into fields. There should be html in the background to guide that assignment, and the user should expect to correct errors.
In my ideal Zotero iteration, Zotero writes to the PDF whatever we enter into the metadata fields; since Zotero already reads PDF metadata, it could write it too. Any PDF that gets dropped in would automatically get a "new item" sidebar with unfilled fields. Entering a DOI might help track down metadata for a corresponding published PDF, or the user might need to fill the fields manually or from the widget.
As an optimal feature, Zotero would have a check box for any PDF lacking a DOI or ISBN, "Request registration".
Enough for MetaData. Look for new topic "merge".
1) have a working contract with Crossref, DataCite, or possibly another DOI registration organization, which involves some kind of payment;
2) provide the necessary metadata to describe the digital object.
So there can't be a DOI without the metadata and some money. Who is supposed to supply that, in your ideal situation?
And PDFs don't actually contain a lot of structured metadata, so it is difficult to estimate how reliable an automatic extraction would be. Title: probably. Authors: I'm not so sure. Publisher: maybe. Number of pages: OK, that one should be easy :-). Document type: forget it. Etc. As for storing structured metadata into a PDF, there are certainly ways to do it in principle but I'm not sure there is a standard choice that you can expect to be recognized by widely used software (such as Zotero or others).
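To illustrate why a lot of bibliographic data has no obvious home in a PDF, here is a rough Python sketch. The key names in `PDF_INFO_KEYS` are the standard entries of the PDF document information dictionary; the mapping `FIELD_TO_INFO_KEY` and the function `split_fields` are hypothetical names made up for this example, and the choice of which citation field goes where is only an assumption, not what Zotero or any other tool actually does.

```python
# Standard entries of the PDF document information dictionary.
PDF_INFO_KEYS = (
    "Title", "Author", "Subject", "Keywords",
    "Creator", "Producer", "CreationDate", "ModDate",
)

# Hypothetical mapping from citation-style fields to Info keys (an assumption).
FIELD_TO_INFO_KEY = {
    "title": "Title",
    "authors": "Author",
    "abstract": "Subject",
    "tags": "Keywords",
}


def split_fields(item):
    """Split a citation record into Info-dictionary entries and leftovers."""
    info, leftover = {}, {}
    for field, value in item.items():
        key = FIELD_TO_INFO_KEY.get(field)
        if key is None:
            # No standard slot: DOI, publisher, document type, etc. end up here.
            leftover[field] = value
        elif isinstance(value, (list, tuple)):
            # The Info dictionary holds plain strings, so lists get flattened.
            info[key] = "; ".join(value)
        else:
            info[key] = str(value)
    return info, leftover


item = {
    "title": "Annual Report",
    "authors": ["A. Smith", "B. Jones"],
    "DOI": "10.1000/xyz123",
    "publisher": "Example Org",
}
info, leftover = split_fields(item)
print(info)
print(leftover)
```

Anything that lands in `leftover` would need a custom key or an XMP packet, and there is no guarantee other software would read it back.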
Otherwise, yes, that basic data would be important and useful. Well, often enough they have a DOI in the document. The DOI is metadata, right?
Anyway, I know you are busy and I appreciate that you take the time to answer questions in the forum. Right now I need to go weed my Zotero library.
(And DOI registration has to require a membership because organizations issuing DOIs need to have some sort of plan to keep the identified object accessible, e.g. by updating a URL if an item/site moves. The actual cost, of Crossref DOIs in particular, is quite low: https://www.crossref.org/fees/#annual-membership-fees)
And yes, there are commitments beyond just paying a few bucks for a DOI (which is actually the right order of magnitude for the price)
Even if it is only published in the text of the document? That makes it easier, and I think what that means is that if I include the putative DOI in my PDF, Zotero will identify the document I am trying to store.
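For anyone curious what "finding a DOI in the text" could look like, here is a minimal Python sketch. It is not Zotero's actual code. The pattern is a loose variant of the DOI regular expression Crossref suggests for modern DOIs, `find_doi` and `crossref_url` are hypothetical helper names, and the only external thing assumed is the public Crossref REST endpoint `https://api.crossref.org/works/<DOI>`. The sketch only builds the lookup URL and does not contact the server.

```python
import re
from urllib.parse import quote

# Loose pattern for modern DOIs: "10.", a 4-9 digit prefix, "/", then a suffix.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[-._;()/:A-Za-z0-9]+")


def find_doi(text):
    """Return the first DOI-like string in the text, or None if there is none."""
    match = DOI_PATTERN.search(text)
    if match is None:
        return None
    # Sentence punctuation right after a DOI is usually not part of it.
    return match.group(0).rstrip(".,;:)")


def crossref_url(doi):
    """Build the Crossref REST API URL for a DOI (no network request is made)."""
    return "https://api.crossref.org/works/" + quote(doi)


page_text = "Please cite as: Example Org (2015). See https://doi.org/10.1000/xyz123."
doi = find_doi(page_text)
print(doi)
print(crossref_url(doi))
```

The point is that the DOI alone is enough to form the lookup; title and author are not needed for the request, although the result still has to be checked by the user.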