Language Translator
Project Goal
The goal of this project was to translate a given PDF file (ex: academic paper) into the language you’d like by using ChatGPT 3.5 via API call.
Note: When this project was developed, ChatGPT 4 API call was unavailable.
Library
Here are the main libraries I used for this projects:
I used pypdfium2 for extracting texts from a PDF file, NLTK for tokenizing sentences, Pandas for saving the API call log mainly, Pickle for saving
the translation results, openai for API call.
API Call
Since openai API call had request, okens limit and for better translation results, I broke down a document into paragraphs level and here I selected 5 sentences as a paragraph and save it as a txt file.
Then, iterated all the text files and made API calls to ask ChatGPT to translate each paragraph, and saved it as a pickle.
The final output txt file would be a single file with all the translated results.
Note
I found out with clear prompt messages, ChatGPT would perform better.
For detail codes and graphs, please check the GitHub link above 👆
photo credit: https://www.pcmag.com/news/more-workers-are-using-chatgpt-and-theyre-not-telling-their-bosses