specter119
About
- Username
- specter119
- Joined
- Roles
- Member
Comments
-
@martynas_b Thanks. I can't reproduce the case where GBK works better than UTF-8. I tested two Chinese PDF files, and the two methods return similar full texts. The Chinese PDF I previously converted to text may be too old. As for the DOI, both cnki.ne…
-
@bwiernik Yes, but "text from PDF" is related to the full-text indexing process, and this process used UTF-8 as the default decoding method in the past (when users had to download the `pdfinfo` and `pdftotext` binaries, and it seems to be bundled with stand…
-
@bwiernik There are two issues with retrieving metadata from Chinese PDFs. For one thing, Chinese PDFs usually don't have valid metadata, because they are mostly converted from MS Word (by the editor, maybe?). For another, decoding with GBK is more appropriate than wi…
-
I guess everyone who uses Zotero and BBT together will vote for this. A search set is a far smarter collector than a category.
-
Should I submit a GBK encoding request on GitHub for the developers?
-
@adamsmith Thank you, I get your point.
-
I see, thanks for your work!
-
@adamsmith I couldn't agree with @WarthogARJ more. Is this project still active?