Upload a File or Multiple Files and Chat with It

With the rising popularity of ChatGPT and its competitors, a common challenge is enabling chatbots to answer domain-specific questions seamlessly, especially when users want to upload and query files like Excel, CSV, or PDFs. ChatGPT introduced its “advanced data analysis” feature a few months back. Now, various platforms, including Relevance AI, Quivr, and PDF.ai etc., offer similar capabilities. But which one should you pick?

ChatGPT’s advanced data analysis, earlier known as the “code interpreter,” is noteworthy. When users upload a file, like a PDF, it doesn’t just provide an answer. It also details the coding steps behind that answer. However, this feature can be a double-edged sword. For users who don’t have a coding background, this detailed breakdown might seem overwhelming and not very user-friendly. Additionally, while its answers are accurate and can be adjusted by those who understand the code, the tool sometimes runs into hitches. For instance, if there’s an image in a PDF, the user has to instruct the bot to deploy OCR technology for extraction, which, although ChatGPT can manage, adds an extra layer of user involvement.

On the flip side, platforms like Relevance AI and Quivr present users with a sleek interface where they can easily upload their files and start their queries. Quivr stands out in that it doesn’t bill you through its platform; instead, you plug in your API key and bear costs directly through your OpenAI account. However, Quivr does have limitations. In my tests, it restricted usage to the gpt-turbo-3.5 model, which, while cost-effective, sometimes falls short in performance. A deeper look into its GitHub revealed that the processing is done using Python libraries with limited flexibility, especially when a PDF has mixed content like text, tables, and images. This limits its adaptability, so unless your documents are purely textual, Quivr might not be the best choice, despite some commendable design aspects.

PDF.ai, in my view, truly shines. As its GitHub profile suggests, PDF GPT is designed to let users converse with the contents of their PDFs using GPT’s capabilities, branding itself “PDF GPT allows you to chat with the contents of your PDF file by using GPT capabilities. The most effective open source solution to turn your pdf files in a chatbot!” They’ve incorporated backend solutions that leverage OCR and provide options to switch to models like text-DaVinci-003 or GPT4, which have demonstrated better efficacy. Being open-source, it’s not just a tool I’d recommend for conversing with your PDFs, but also a resource for me to craft a tailored applications for our own knowledge domain chatting app.

with the explosive growht of chatGPT and its rival products, the biggest barrier for most people/companies is to have a chatbot answers qustions particular to their domain fed in by “no-code”ly , for example, user wants to upload a specific excel/csv file, pdf file or files and then ask the bot to get answer instantly and accurately.
so chatGPT launched this “advanced data analysis” a couple of months ago, and there are more than a dozon of apps online such as Relevance AI, quivr, pdf.ai etc. that allows user to do so.
however, what’s the difference, which one should you choose to use? Diving into the details of their codes or mechanism beneath, I could draw some of my conclusions.
first, i really like the chatGPT’s advanced data nalaysis function, formerly known as “code interpreter”, as is more eptly named, because when you upload a file such as pdf file, it not only provides you the answer you may seek, but also details steps in coding, for bit complex jobs, you may encounter quite a lot of “apologies, seems it doens’t work or forgot to install a library” answers. for people whoe are not familiar with basic coding, it is not friendly nor fast. but the result is accurate and adjustable if you know the stuff. additionaly, some time you might be hung there because there is an image in the pdf file, so you need to further instruct it to use OCR technology to extract info out, which chatGPT will happily comply and carry on.
On the other hand, tools like Relevance AI, quivr, PDDF.ai etc. provide clean neat interface for you to upload your files and then start chating with it. I like quivr in the sense that it doesn’t require you to pay the platform, but plug in your own API key and incur fees just on your own openAI account. The model from quivr is not flexible, as when I test it these days only gpt-turbo-3.5 is allowed, which is the most cost effective model but the performance is lackluster. Diving into its github, the retrieving and process is accomplishedy python libraries to read in wihtout further option to manipulate to deal with scenarios where one pdf contains text, tabular and image info. it’s at the ends of ability. Hence, unless all your documents are textual, I won’t recommend using quivr even I do like alot of its design.
What stand out i think is pdf.ai. As it claimed in its github profile, PDF GPT allows you to chat with the contents of your PDF file by using GPT capabilities. The most effective open source solution to turn your pdf files in a chatbot! I particularily like the author’s public note on his opinion about too generic, medocre perofmrnace of git-3.5-turbo, instead, providing various solution in back end by applying OCR suport as well as swapping modesl with old text-DaVinci-003 or use GPT4 proves to be more effective. it’s also opensourced, so not only I recommend this app to chat with your pdf files, but also aim to create our customized app by referencing his approach.

Leave a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.