Groq is blazingly fast. Here we have the Llama 3.1 8 billion parameter model running next to the GPT-4o Mini model, and as you can see, Groq leaves OpenAI in the dust.

Groq uses its own inference architecture to deliver fast responses from large language models. Groq lets us run open source models in the cloud and, importantly, offers a generous free tier as well. This is perfect for those of you who want to learn how to build AI applications without spending money or needing a powerful PC to run models locally.

And of course, Groq is perfect for building production-grade applications as well. In this video, we'll create a RAG chatbot using Groq and Llama 3.1.

This demo will show the tool use capabilities of the Llama 3.1 model, and you will also learn how to use vector embeddings to answer questions about your own documentation. Let's start by creating a new chat flow.

Let's give it a name like Groq RAG. Let's save this. Then let's start by adding a new node to the canvas.

As with all chat flows, we need to add either an agent or a chain node. Let's add an agent. Within agents, let's add the tool agent.

Let's start by adding our chat model. Let's go to chat models, and let's add the Groq chat node. Then let's attach our LLM to our tool agent.

This Groq chat node requires credentials as well as a model name. Let's have a look at setting up our credentials.

First, go to groq.com, and then sign in to your account. I'll simply log in with one of my Google accounts. After signing in, click on GroqCloud, and this will take you to the Groq playground.

To get an API key, go to API keys on the left-hand side, and then create a new key and give it a name. I'll call mine Flowise demo. Let's submit this, then copy the key. Back in Flowise, click on the drop-down next to the credentials, and then click on create new.

Give this credential a name like Groq API, and then paste in your API key. Let's click on add, and we can now move on to selecting a model. Under model name, we can click on this drop-down to see all the available models within Groq.

Let's select the Llama 3.1 70 billion parameter model. Let's set the temperature to something like 0.6. That's all we have to do to set up our chat model.

Next, let's add a memory node by going to add nodes, then under memory, let's add the buffer memory node.
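To picture what the chat model settings and the buffer memory amount to together, here is a minimal sketch of the request body for one chat turn. It assumes Groq's OpenAI-compatible chat completions endpoint. The model ID is an assumption (check the Groq console for the current name), and `BufferMemory` and `build_chat_request` are hypothetical helpers for illustration, not Flowise internals.

```python
import json

# Groq exposes an OpenAI-compatible chat completions endpoint.
# The body built below would be POSTed here with an
# "Authorization: Bearer <your API key>" header.
GROQ_CHAT_URL = "https://api.groq.com/openai/v1/chat/completions"

# Assumed model ID for Llama 3.1 70B; model names change over time.
MODEL_ID = "llama-3.1-70b-versatile"


class BufferMemory:
    """Hypothetical stand-in for a buffer memory: keeps every turn in order."""

    def __init__(self):
        self.messages = []

    def add(self, role, content):
        self.messages.append({"role": role, "content": content})


def build_chat_request(memory, user_message, temperature=0.6):
    """Build the JSON body for one chat turn, including the stored history."""
    memory.add("user", user_message)
    return {
        "model": MODEL_ID,
        "temperature": temperature,
        "messages": list(memory.messages),
    }


memory = BufferMemory()
memory.add("user", "Hi, I'm planning a visit.")
memory.add("assistant", "Great! How can I help?")
body = build_chat_request(memory, "What are your operating hours?")
print(json.dumps(body, indent=2))
```

The point of the sketch is that the model itself is stateless: the memory node is what carries earlier turns into each new request.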

Let's also attach the memory node to our tool agent. Now let's have a look at assigning a tool to this agent. This agent will be used to answer questions about a fictitious restaurant called the Oaken Barrel.

The knowledge base is simply this Word document that contains some questions and answers about the restaurant. So in order to provide information about this Word document to the agent, we need to assign a retriever tool. Under add nodes, let's open up the tools menu, and within tools, let's add the retriever tool.

Let's also assign this retriever tool to the tool input on the tool agent. Now let's give our tool a name like restaurant information. The description is extremely important, as this will tell the agent when to use this tool.
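To see why the description matters, here is a rough sketch of how a tool is typically described to a model in the OpenAI-style function-calling format, using the name and description from this walkthrough. The video does not show the exact payload Flowise builds, so treat the shape as an assumption. The underscore in the tool name, the `query` parameter, and the `make_retriever_tool` helper are hypothetical.

```python
import json


def make_retriever_tool(name, description):
    """Describe a retriever tool in the OpenAI-style function-calling format."""
    return {
        "type": "function",
        "function": {
            "name": name,
            # The model reads this text to decide when to call the tool.
            "description": description,
            "parameters": {
                "type": "object",
                "properties": {
                    # Hypothetical parameter: the question to search for.
                    "query": {
                        "type": "string",
                        "description": "Question to look up in the knowledge base",
                    }
                },
                "required": ["query"],
            },
        },
    }


tool = make_retriever_tool(
    "restaurant_information",
    "Provides information about the restaurant",
)
print(json.dumps(tool, indent=2))
```

The model never sees the vector store directly. It only sees this name and description, which is why a vague description can lead to the tool being skipped.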

So let's enter something like "Provides information about the restaurant". Now we need to assign a vector store to this retriever tool. Let's go to add nodes, then let's go to vector stores, and let's simply add this in-memory vector store node.

Let's attach our vector store to the retriever tool, and now we can use this vector store to store and fetch information about our business. Let's start by loading the data from the Word document. So for the document input, let's go to add nodes again, let's go to document loaders, and let's add the docx file node.

Let's also link this docx file to the document input on the vector store. I'm also going to select that Word document from my PC. Let's also add a text splitter to this node.

So under add nodes, let's go to text splitters, and let's add the recursive character text splitter, like so. I'm going to change the chunk size to 250 with a chunk overlap of 20. Now finally, let's talk about the embeddings function.
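Before moving on, the chunk size and chunk overlap settings from the splitter step can be pictured with a small sketch. This is a simplified fixed-window splitter, not the recursive algorithm the node uses (which prefers to break on separators such as paragraphs and sentences). `split_with_overlap` is a hypothetical helper for illustration.

```python
def split_with_overlap(text, chunk_size=250, chunk_overlap=20):
    """Cut text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


# 600 characters of filler text, just to show the window sizes.
sample = "".join(chr(ord("a") + i % 26) for i in range(600))
chunks = split_with_overlap(sample)
print([len(c) for c in chunks])  # [250, 250, 140]
print(chunks[0][-20:] == chunks[1][:20])  # True: neighbours share 20 characters
```

The overlap means a sentence that falls on a chunk boundary still appears whole in at least one chunk, which helps retrieval.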

At the time of recording, Groq does not offer an embeddings model, or at least there's no such option in Flowise. So if I go to embeddings, there's no embedding option for Groq. So we'll use the Voyage AI embeddings node instead.

So let's connect the Voyage AI embeddings node to our vector store as well. In order to use Voyage, we do need to set up credentials as well. So under credentials, let's click on create new.

Let's give our credential a name like Voyage API demo, and for the Voyage API key, we need to go to voyageai.com and sign in. Then let's click on create new API key.

Let's give it a name. I'll just call mine Groq, and let's click on create API key. Let's copy this key, and let's paste it into Flowise.

Let's go ahead and save this flow. Then let's load the data from our Word document into the vector store by clicking this green button. Let's click on Upsert to load the data, and we can see that 23 documents were added to the vector store, and we should now be able to ask questions about our document.
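What the upsert and later lookups do can be sketched with a toy in-memory vector store. Everything here is an illustrative assumption: the word-count `embed` function stands in for a real embeddings model such as Voyage AI, the three sample chunks are invented placeholders rather than text from the Word document, and the class is not the Flowise implementation.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: word counts. A real flow would call an embeddings model."""
    cleaned = text.lower().replace("?", "").replace(".", "")
    return Counter(cleaned.split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryVectorStore:
    def __init__(self):
        self.rows = []  # list of (vector, original chunk) pairs

    def upsert(self, chunks):
        # This sketch only inserts; a real upsert would also replace existing entries.
        for chunk in chunks:
            self.rows.append((embed(chunk), chunk))
        return len(chunks)

    def search(self, query, k=1):
        """Return the k stored chunks most similar to the query."""
        q = embed(query)
        ranked = sorted(self.rows, key=lambda row: cosine(q, row[0]), reverse=True)
        return [chunk for _, chunk in ranked[:k]]


store = InMemoryVectorStore()
store.upsert([
    "Our operating hours are 11am to 10pm every day.",  # invented sample chunk
    "Current specials include a grilled salmon plate.",  # invented sample chunk
    "Reservations can be made by phone.",  # invented sample chunk
])
print(store.search("What are your operating hours?"))
```

The retriever tool does the same thing at a larger scale: it embeds the question, finds the closest chunks, and hands them to the model as context.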

So in the chat, let's ask something like, "What are your operating hours?" And we can see that the tool was indeed used to fetch this information. Let's also ask, "What are your current specials?"

And again, the tool was definitely used, and the correct information was pulled through. If you enjoyed this video, then please hit the like button and subscribe to my channel. If you would like to learn more about Flowise, then you definitely want to check out this multi-agent video over here.

How To Build a RAG Chatbot with Groq and Llama 3.1 - Flowise AI Tutorial