We all use AI tools like ChatGPT, and Midjourney in our everyday lives but have you ever thought of creating something like that with a feasible investment? And even that too – locally? The answer is yes we can do that, but first let’s understand why you’ll need such a system in the very first place.
So let’s assume for a moment that you have a team of designers, copywriters, and editors where everyone uses Midjourney, ChatGPT & other AI tools on a daily basis. Now, to meet their requirements even if you get 4 to 5 subscriptions for your team with 10-15 members(let’s say), you will still end up paying 15-30 thousand rupees per month. Calculating on a yearly basis, this comes to around 4 lakhs, and if calculated for long-term (let’s say 5 years) – the calculation will jump into jumps to over 20 lakhs!!
But, in the same scenario, if you had chosen to invest in a local machine running the same kind of AI models (with unlimited possibilities giving much better results), you might have saved around 7 to 8 lakhs easily answering why “I need to invest on local machine” question. In addition to that, you’ll get a customized AI model that not only answers your questions but also builds conversation while keeping the programmed replies like “As of my last update….”, “I’m here to keep things positive…”, etcetera etcetera a far away.
How to start? So it all starts with a high-end server. And here we are not talking about the latest RTX 4090s made for gaming – but technology beasts like Ampere GPUs, Xeon Ultra CPUs, SAS SSDs, and many more which will be required to run AI models which are as reliable as your service providers. For example, a good recommendation would be a pair of two RTX6000 Ada or an A6000s – both models give you 2x 48GBs of VRAM, easily capable of handling even a 40B parameter LLM. And then things come down to choosing a framework.
We are aware right that the market is full of AI chatbots. Every day or another, a new AI is released with the title of being “2 times faster more accurate and faster!”
But as we are just starting, let’s not get involved in “this model is better or that model is better” and let’s begin with Falcon-40b by HuggingFace as a start as it performs better than most models while giving you an open-source code so you can build your own AI model under the Apache 2.0 license.
Talking about the inference performance of this Falcon-40b model on two RTX 6000Ada – it is magical and takes just one second to produce a fruitful response. Thanks to our two GPUs with approximately 90GB it doesn’t lack any performance.
But here’s a note: since the model is new and there are a lot of things this has to learn, it might generate simple responses and sometimes responses that is not something a trained AI model like ChatGPT would give – telling us that there is a need of fine-tuning in the model. Here, the Falcon 40b base version enters with a lot of methods and Open-Assistant datasets providing you with the best environment to begin with model refinement and stability.
So through this blog, we’ve already come so far from knowing nothing about building an AI chatbot to building one to save the “waste cost spent”. But it’ll be really of no use until you start. So if you are still confused, you can start testing on AWS and automatically you’ll realize as per the output, the best framework for you.
But if you’re wondering what’s the cheapest option available – then don’t worry because we’re going to reveal that in our upcoming blog article. And in case you are looking to get it anyway – visit our themvp.in stores in Hyderabad, Bengaluru, Mumbai, or Gurgaon where we offer solutions for just anything.