Extracting annotations from PDF breaks Korean characters

Problem statement
- When I add a Korean patent PDF from Google Patents and open it in the built-in reader, the text is not extracted correctly. Some characters are replaced by different characters (it does not seem to be an encoding problem, though).

[right] Copied from an external PDF reader (SumatraPDF): 본 고안의 관절 도어스토퍼는 원하는 각도로 개방된 도어를 열린 상태대로 걸림 고정되게 하는 것으로

[wrong] Extracted annotations and text copied from the built-in PDF reader:
본 고안의 관윈 도어스토퍼는 원하는 각도띜 개묩댜 도어를 열린 상태대띜 걸림 고윕댘게 하는 것으띜

Examples of the substituted characters (correct:extracted):
절:윈
로:띜
정:윕
되:댘
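
One way to check whether these substitutions are random or systematic is to compare the Unicode code points of each pair. The snippet below is only a diagnostic sketch in Python, run outside the reader; `codepoint_shifts` is a hypothetical helper name, not part of any PDF tool. For the four pairs listed above, every extracted character comes out exactly 0x100 below the correct one. A constant offset like that may point to a character-mapping problem in how the PDF's embedded font is interpreted rather than a text-encoding problem, but that is a guess, not a confirmed cause.

```python
import unicodedata

def codepoint_shifts(right, wrong):
    """Return (right_char, wrong_char, difference) for each position that differs."""
    # Normalize to precomposed syllables so each Hangul syllable is one code point.
    right = unicodedata.normalize("NFC", right)
    wrong = unicodedata.normalize("NFC", wrong)
    if len(right) != len(wrong):
        raise ValueError("strings must have the same length to compare position by position")
    return [(r, w, ord(r) - ord(w)) for r, w in zip(right, wrong) if r != w]

# The four pairs from the list above, joined into two aligned strings.
shifts = codepoint_shifts("절로정되", "윈띜윕댘")
for r, w, diff in shifts:
    print(f"{r} U+{ord(r):04X} -> {w} U+{ord(w):04X}  (difference {diff:#x})")
```

The same function can be given the two full sentences, as long as both copies have the same number of characters.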

I have been struggling to solve this, but I have no idea where to start looking.
Any help would be greatly appreciated. Thank you.