Is Zotero suitable for scraping forums?
I am working on a translator to scrape a forum (IPBoard3-based) and have reached the point where I now have all that I want, per conversation/thread, in an array. The forum contents will be used to write a User's manual and Developer's manual for the product the forums are there to support.
My intent is to have Zotero store whole conversations in its database, *not* to take snapshots of whole web pages (and not to select numerous sections of the page, either).
(I will continue to develop the translator to organize the contents of the array into a new Zotero Item and record it to the database.)
My concern is:
Am I using a sufficiently versatile tool to accomplish this?
Which field in the Zotero field set can store and return an array? (The Note field of a Zotero item seems to hold only text.)
Specifics: the conversation array is as follows...
'Title' => "Topic Subject"
'Type' => "Forum Post" (but it is more an entire thread)
'Thread' => array( [0] => array( "poster" => "post" ) ... [n] => array( "poster" => "post" ) )
Will I need to create a new item format?
I would model this as an article type (or book section?). A journal article could represent a post, and threads could be journal issues. If the discussion is not threaded, this might work. You'd just loop through your arrays and create an item for each entry.
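To make the "one item per entry" idea concrete, here is a rough sketch of the loop in TypeScript. It only flattens the conversation array into plain records; it does not call any Zotero API. The names `Conversation`, `PostRecord`, and `threadToRecords` are made up for illustration, and the choice of which field holds what (thread title in `issue`, post text in `abstractNote`) is an assumption, not something Zotero prescribes. In a real translator each record would still have to be copied into a Zotero item and saved.

```typescript
// Shape of the scraped conversation, mirroring the array in the question.
// Each Thread entry maps one poster name to that poster's post text.
interface Conversation {
  Title: string;
  Type: string;
  Thread: Array<{ [poster: string]: string }>;
}

// Hypothetical flat record, one per post. The field names echo common
// Zotero field names, but the mapping is an assumption for illustration.
interface PostRecord {
  itemType: string;     // "journalArticle", as suggested above
  title: string;        // thread title plus the post's position
  creator: string;      // the poster
  abstractNote: string; // the post text
  issue: string;        // the thread, standing in for a journal issue
}

// Loop through the thread and build one record per poster/post pair.
function threadToRecords(conv: Conversation): PostRecord[] {
  const records: PostRecord[] = [];
  conv.Thread.forEach(function (entry, index) {
    Object.keys(entry).forEach(function (poster) {
      records.push({
        itemType: "journalArticle",
        title: conv.Title + " (post " + (index + 1) + ")",
        creator: poster,
        abstractNote: entry[poster],
        issue: conv.Title,
      });
    });
  });
  return records;
}

// Usage with made-up sample data.
const sample: Conversation = {
  Title: "Topic Subject",
  Type: "Forum Post",
  Thread: [
    { "first poster": "first post" },
    { "second poster": "a reply" },
  ],
};

threadToRecords(sample).forEach(function (record) {
  console.log(record.title + " by " + record.creator);
});
```

Keeping the thread title on every record is what lets the posts be grouped back into a conversation later, since each post ends up as a separate item.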