In this video I'll show you how to build an AI voice assistant with Python. This assistant will have agent capabilities, which means it will be able to interact with a database, call Python functions, and really do anything that you want. After we build the AI voice assistant, which actually won't take that long, I'll show you how to integrate it into your own custom front end so you can see how to interact with this in the real world. This is super cool, I guarantee you're going to learn a ton, and you'll see how simple it
is to actually make these agents even if you're not a complete AI expert. With that in mind, let me give you a quick demo of what we're going to build, and then we'll dive into the tutorial and I'll walk you through it step by step. What I did is I just made a really simple landing page for something like a car call center, where maybe you need to call in if you want to schedule a service appointment, or if you need an oil change, or whatever it might be. Typically for
these types of places you're going to talk to someone on the phone, and you have to go through all the AI messaging where you have to give the VIN of your car, the date of the appointment, all that kind of stuff. So I wanted to show you how you can automate that using an AI voice assistant. Let's have a look. There's this button here that says "Talk to an agent". If I press it, I can enter my name and then I can connect into this AI voice assistant
system, and I'll show you how it works. So let's do this. "Welcome to our Auto Service Center. For us to assist you further, could you please provide the VIN of your vehicle? If you don't have a profile with us yet, just let me know and we can create one for you." Okay, so here it's asking me for the VIN. Typically, again, when you go through these systems, it needs to look up some kind of profile and you need to have the information on the car. So what I'll do is I will tell it to
make a new profile, as if I don't have one. "Hey, can I make a new profile please? I don't have a VIN on file." "Of course. To create a new profile, I'll need some information about your vehicle. Could you please provide the make, model, and year of your car?" "Sure, this is a Jeep Wrangler 2025 and the VIN is V123." "Your new profile has been created for your 2025 Jeep Wrangler. How can I assist you further today?" "Can I schedule a service appointment for this, please?" "I'll transfer you to our scheduling department to book your service
appointment for your Jeep Wrangler. Please hold for a moment while I connect you." Perfect, so that's exactly what I wanted it to do. Now look, there's obviously a lot more functionality that this has, but the idea is this is actually able to interact with the database. So when it said, you know, "can you give me this information about the car," the reason why it's doing that is it's actually going to save this in the database. So next time, if I were to come back to this, it would already have that profile saved. Based
on my VIN, it would be able to look up the car information, and then we could forward that over to the scheduling department, or whatever department it is that we need to go to, and they already have all of the data that they need to assist us with booking an appointment or whatever it is that we wanted to do. I didn't build out a ton of features here. I wanted to keep it minimalistic so that you guys can extend it to what you actually need. But I promise you, in this video you're going to learn
a lot about how this works, and you'll understand how to change this to your own custom situation, or whatever it is that you want to build out, because I know a lot of you probably aren't going to build this exact example. So with that in mind, let's get into the tutorial. Let me show you how you build something like this. The first half will be more about building the agent, and then the rest of it will be actually integrating this into the front end. So let's begin here by learning a little bit about the framework
/technology we're going to be using in order to build out this project. What I'm going to be using for this video is LiveKit. Now, they are the sponsor of this video, and I've worked with them in the past, and you guys really enjoyed that other video we did, which is why we teamed up again. They are open source and completely free; you do not need to pay for them, at least if you're just using it in a tutorial setting here. And they are actually what's used by a ton of massive companies
including OpenAI. So if you've ever used OpenAI's voice mode before, that's actually powered by LiveKit, and the reason why they use LiveKit is because this is really the best in the business when it comes to ultra-low-latency transport of voice, video, audio data, all of that kind of stuff. So what you can do with LiveKit is obviously build real-time AI systems, like we're going to be doing here with the ultra-low-latency voice assistant, but you can also do video conferencing, you can have audio conferencing apps, you can
do all kinds of really interesting stuff. If you want to check them out, again, do it from the link in the description. In this video I'm going to be focused on the voice assistant agent, but of course this has a lot of other functionality. Anyways, in order to get started with LiveKit, what we're going to do is just make a new account. Again, it's free, you do not need to pay for this. And the nice thing about LiveKit is that if you don't even want to use their cloud
hosted options, you can actually host the LiveKit server completely on your own. So if you do build this out and you decide that you want to deploy it, you're not tied into this kind of environment or this pricing model; you can just go and deploy it by yourself. But obviously it's easier if you just use LiveKit's hosted environment. So what I'm going to do is go to cloud.livekit.io. It's going to bring me into my account because I already have one, but if you don't have an account, please
just go ahead and create one. Then what we're going to do is make a new LiveKit project, and from there it'll be very straightforward to get started. So I'm going to go to the bottom here and press "Create new project". You can see I already have some spun up here. I'll give this a name, and I'm just going to call this "Car service center" or something like that. Okay, create project, and then I'll tell you where to go from there. All right, so I'm here in the project, and
honestly there's nothing that we need to do to set this up other than get access to a few of the API keys that we'll use to connect to the LiveKit server. Before we go through that, I will just mention that if you do want to mess around with this on your own, they have a bunch of great quickstart guides and agent guides, and that's what I followed in order to build out the code that you're going to see me write. So if you want, you can just read the documentation directly,
but obviously, walking through the video, I'm going to explain some more concepts that aren't in there. Okay, so that's pretty much it. What we're going to do next is open up a code editor. In my case I'm using VS Code, and I'm just going to start by making a .env file where I'm going to paste in some of the keys that I'll need in order to connect to my LiveKit project. So let's just get that out of the way right now, and then I'll talk about the architecture and
what to expect when we build this out. In order to get the keys for our project, we're going to go to Settings and then to Keys. For keys, we'll just press "Create new key". I'm going to give this a name, so I'll just say this is my Windows machine; you can call it anything that you want. What this is going to do is give you an API key, a secret, and a WebSocket URL. For now we'll just copy all of them, so let
me just split my terminal here, one second, let me get this guy open over here. The first environment variable that we're going to have is LIVEKIT_URL, so we're just going to copy this WebSocket URL here for the LiveKit URL. Let me just make this a little bit larger. Next we're going to have our LiveKit API key, so we're going to say LIVEKIT_API_KEY is equal to, and then same thing, copy the API key. Obviously you don't want to leak these; I'll leak them to you
now and I'll just delete them after. And then we're going to go LIVEKIT_SECRET_KEY is equal to, and then this. Okay, so make sure you copy those correctly into these variables, because obviously we'll need those in order to connect to LiveKit. Now that we have those, the only other thing that we're going to need to continue with this project is an OpenAI API key. Actually, you can use any AI provider that you want with LiveKit, or at least there's a lot of different options. For example, you could
use Ollama and run this locally on your own computer, but it's just a lot easier to use OpenAI. So what I'm going to do is say OPENAI_API_KEY is equal to this, and I'm going to go and get my OpenAI API key. In order to do that, we can go to platform.openai.com/api-keys. I will leave this link in the description. I believe you will need a credit card on file in order to do this, but it should be free, or it should cost you just
a few cents. Again, if you don't want to use this, you can look in LiveKit and there's a lot of other options for the AI models; OpenAI is just going to be the easiest and the fastest in this case for us. So what I'm going to do is make a new key: create a new secret key, just call this "LiveKit car service" or something. Okay, let's create that key and then we will copy it into our project. I'm going to copy it here in that environment
variable file. Perfect. Let me get out of that, and now we will go back to VS Code and we will start writing some code. But first I want to explain to you the architecture and design of how LiveKit works. I've got the LiveKit documentation opened up, and I just want to show you this diagram and explain what's actually going to go on in terms of the flow of data. You don't have to understand it fully, but I think it's a good idea before we go much
further. What we're going to be focused on building right now is this agent. You see where it says "your backend" and then it says "agent": this is what we're going to build. This agent will connect to LiveKit Cloud, and that will be connected to our various clients. Essentially what will happen is we'll have some client, whether it's a website, whether it's our phone, whatever. They will connect to LiveKit Cloud, and when they do that they're going to open up a new room. Immediately when this
room is created, our agent will join that room and then start communicating with the client. That communication will happen through WebRTC, which will be facilitated through LiveKit Cloud. This means we don't need to deal with all these transfer protocols and sending the data back and forth; LiveKit does that for us. We essentially just define what we want the behavior of our agent to be, and then everything is taken care of by LiveKit. Obviously our agent will also connect to OpenAI. In this case we're going to use something
that's relatively new, which is the Realtime API, which allows us to have extremely low latency in the responses. As you saw, it was very, very fast; it's actually insane how fast it is, because we're using this new Realtime API feature. There are some other things, again, that you can build here with LiveKit, but in our case we're going to do the OpenAI Realtime API. This documentation I will link in the description, and you can read through it if you want to understand it more, but that's the basic architecture.
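
To make the room flow described above a bit more concrete, here is a toy sketch in plain asyncio. It is not LiveKit code: the names agent_session and dispatch, and the room and participant names, are made up for illustration, and in the real setup LiveKit Cloud does the dispatching and carries the media over WebRTC.

```python
import asyncio


async def agent_session(room: str, participant: str, log: list[str]) -> None:
    """Toy stand-in for one agent job: join a room, then greet the caller."""
    log.append(f"agent joined {room}")
    await asyncio.sleep(0)  # yield control, the way real network I/O would
    log.append(f"agent greeted {participant} in {room}")


async def dispatch(rooms: dict[str, str]) -> list[str]:
    """Toy stand-in for dispatch: start one agent job per newly created room."""
    log: list[str] = []
    await asyncio.gather(*(agent_session(r, p, log) for r, p in rooms.items()))
    return log


# Two clients open two rooms; both agent jobs run concurrently.
log = asyncio.run(dispatch({"room-a": "Tim", "room-b": "Sam"}))
print(log)
```

The point of the sketch is only the shape of the flow: a room appears, an agent job is started for it, and several such jobs can be in flight at once.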
We design our agent, some client joins the LiveKit session, and then our agent will join the room with that client and start communicating with them. All of that happens through WebRTC, which is facilitated by LiveKit Cloud, which is very, very fast. One thing to note is that you can have multiple agents running at the same time and joining multiple rooms. LiveKit Cloud will handle that concurrency for you, and you can have multiple rooms at the same time, which means your agent can be talking to multiple people
and have different sessions or conversation chains going on. Hopefully that makes a little bit of sense, but with that in mind let's start building this out. In order to get started, we're going to create a requirements.txt file, and I'm just going to list the various requirements that we need in order to work with LiveKit. So we're going to start listing them out, and all of this code will be available from the link in the description; there'll be a GitHub there in case you just want to copy it
in. First we're going to have livekit-agents. Then we're going to have livekit-plugins-openai, if we can spell this correctly. Then we're going to have livekit-plugins-silero; this will be used for voice activity detection. We're going to have python-dotenv; this is going to be used for loading our environment variables. We're going to have livekit-api, which we're going to use later on in the video. And then we're going to have some dependencies related to Flask: we're going to have flask[async], we're
going to have normal flask, we're going to have flask-cors, and we're going to have uvicorn for running our Flask server. The last five dependencies that you see here will be used later in the video for handling authorization and issuing tokens to our front end in order to connect securely to LiveKit Cloud. Again, you'll see how that works later. For now we're just going to be focusing on these, but I figured we'd put them all in one file so we can install them all at once. Okay, so now that
we have that, what we're going to do is create a virtual environment and then install those dependencies. You don't need to make a virtual environment, but I would recommend that you do, just to separate out the dependencies. To do that, I'm going to type python -m venv, then I'm going to give the virtual environment a name; in my case I'm going to call it ai. If you're on Mac or Linux, you can change this command to python3. When we run this, it should create the virtual
environment for us, and then next we can activate it. This will, again, just separate the dependencies for us. So what I'm going to do now to activate the virtual environment is type ./ai/Scripts/activate. You should do this in PowerShell if you're on Windows, and this will then activate the virtual environment for you. However, if you're on Mac or Linux, your command will be source ./ai/bin/activate. It's just different if you're on Mac or Linux compared to
Windows. Now that you have the virtual environment activated, what we want to do is install the dependencies, so we're going to type pip install -r requirements.txt. This should install all of the requirements we listed in that file into our virtual environment, so we're able to use them in the project, and then we can start writing some code. All right, so all of that is finished, and now we're going to start building out the agent. What I'm going to do is make a few different files. I'm going to make a
file called agent.py; this will be the entry point script. I'm going to make a file called api.py; this is going to specify some of the tools that our agent will be able to use. I'm going to make a file called db_driver.py, which will handle managing our database and connecting to it for things like our vehicle information. And then I'm going to make one more called prompts.py, which will have some of the prompts that we'll use for our AI agent. Okay, so I'm not going to write all of this completely
from scratch, but we will with the agent file at least. The agent file is what handles actually creating the AI agent, the voice assistant, which will connect to LiveKit. To do that, we're going to need a few different imports. We're going to start here and say from __future__ import annotations. Then we're going to say from livekit.agents import, and we're going to import the following: AutoSubscribe, JobContext, WorkerOptions, cli,
and llm. We're then going to say from livekit.agents.multimodal import MultimodalAgent. We're then going to say from livekit.plugins import openai, and from dotenv import load_dotenv. We're then going to say import os. Okay, so that's all we need for now. Next, we're going to load our environment variable file, so we're going to say load_dotenv(). This will simply load the .env file in the current directory, and then we'll be able to access the various keys that we
need for LiveKit. Okay, perfect. Now that we've done that, we're going to start defining some functions that we'll need to use. We're going to define an async function, and we're just going to call this entrypoint. We're going to take in some context, ctx, and this is going to be a JobContext from LiveKit. The first thing that we're going to do inside of this function is wait to connect to LiveKit. To do that, we're going to say await ctx.connect,
and we're going to say auto_subscribe is equal to AutoSubscribe.SUBSCRIBE_ALL. If you hover over this you can see exactly what it's doing, but this is going to connect to a LiveKit room, and we're going to subscribe to all of the various tracks that we have, so that will be the video track, the audio track, and the text track in that room, because again LiveKit can handle multimodal; not just voice, it can handle video, it can handle text as well. And you can see if I go over here
it says "whether to automatically subscribe to tracks"; the default is to subscribe to all, which is what we're doing. The next thing we're going to do is say await ctx.wait_for_participant(). This pretty much means we're going to wait to get the identity of the current participant and just make sure that they're in the room before we start talking to them, obviously. Next, we're going to define the model that we want to use, so we're going to say
model is equal to openai.realtime.RealtimeModel. There's a lot of different models you can use here; in our case we're just using the OpenAI one. For the realtime model, what we need to pass to it is some instructions. This is essentially what we want it to act as; we can give it some kind of system prompt so it knows how it should be communicating with the user. We can then define the voice. Notice there's a bunch of different options here. I'm going to go with
shimmer, but you can choose really whatever you want if you want a different style of voice. Then we can define the temperature of the model, so I'm going to go with something like 0.8. And then we can define the modalities, so essentially what type of medium we want it to work with. In this case I'm going to provide audio and text, so that it will give us voice and it will also generate the text associated with that voice. Now, after we've done that, what we can do is actually create an assistant
that's attached to this model. If you've ever worked with AI agents before, you know that essentially what you do is give an AI model access to various tools that it can call and utilize during its invocation. So when you actually ask the AI model something, it can go and look at the various tools it has access to and then use those tools according to what the user requested, or whatever it thinks it should be using. So what I'm going to do is set up a
class that will contain all of the different tools that this AI agent will be able to use. Those are the tools related to interacting with our database: looking up a car based on its VIN, creating a new profile. It could be scheduling a meeting, it could be calling an API; you can do anything you want with the tools here, and I'll show you how to get that started. So what we're going to do is go into our api.py file, and this is where we're going to define our tools. Now, I'm just going to stub
out what the tool class will look like for right now, so we can actually test the AI agent, and then later we'll write them all out. So I'm going to say from livekit.agents import llm. Then I'm simply going to say class AssistantFnc, fnc standing for function, and I'm going to inherit from llm.FunctionContext. Then all I'm going to do inside of here for now is define a simple initialization. I'm going to take in self and then
I'm going to say super().__init__() and just call the parent constructor. Later on we'll write the rest of this class body and define some functions that will act as tools that the AI agent can use, but for now we just need this in order to invoke the AI agent. Okay, so now that we have that, we're just going to import this assistant function class, so I'm going to say from api import AssistantFnc. Now, beneath our model, we're just going to create a new instance of this.
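
Written out, the stub described above might look roughly like this. It is a sketch that assumes a livekit-agents release that still ships llm.FunctionContext, as the one used in the video does. The try/except fallback is not part of the tutorial code; it is only there so the snippet also runs on a machine without LiveKit installed.

```python
try:
    # Real base class, assuming a livekit-agents version that provides it
    from livekit.agents import llm

    FunctionContext = llm.FunctionContext
except (ImportError, AttributeError):
    # Illustration-only fallback so the sketch runs without LiveKit installed
    class FunctionContext:
        pass


class AssistantFnc(FunctionContext):
    """Container for the tools the agent may call; intentionally empty for now."""

    def __init__(self) -> None:
        # Just call the parent constructor; tool methods get added later
        super().__init__()


assistant_fnc = AssistantFnc()
```

In the actual api.py you would inherit from llm.FunctionContext directly and drop the fallback.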
So we're going to say the assistant (did I spell that correctly? I'm not sure if I did. Let's see: assistant, there you go) is equal to AssistantFnc(). I think I spelled that one correctly; let's just make sure. I'm just going to call this so that we create a new instance of it. Then I'm going to say my assistant is equal, and sorry, this first one needs to be assistant_fnc. I'm going to say assistant is equal to a MultimodalAgent. We're going to say the model is
equal to the model that we defined, and we're going to say the function context, fnc_ctx, is equal to assistant_fnc. So again, what we're doing is making an agent. This agent will use this model, which is where most of the logic comes from, and then we're providing a set of tools, which are defined inside of this class, that the agent will be able to interact with. Sweet. Now that we have that, we're able to actually start the assistant and connect it to the room, so we can
say assistant.start and provide the room, so we can say ctx.room. What we did up here is we waited to connect to this room and for the participant to join, we then created a new AI assistant, and then we said that the AI assistant will now join the room. So now the AI assistant joins the room and it can start interacting with the participant. Sweet. Now what we need to do is tell the AI assistant to do something, so maybe to talk to the user or
to greet them. To do that, we can say session is equal to model.sessions, and we'll just grab the first session that we have. Then we're going to make a new conversation item. I'll talk about how this works in a second, but let's type it out. We're going to say session.conversation.item.create, and then we're going to make a new llm.ChatMessage. This message is going to have a role equal to "assistant", and we are going to have the content equal to, and then we're going to provide the content here
in one second. All we're doing is adding a new chat message into the context of the current session, and then after we add that chat message, what we can tell the assistant to do is respond to that message. So now I'm going to say session.response.create(), and what it will do is read this conversation item and then generate a response. That response will be voice, it will be text as well, and it will actually speak out to us and we'll see what it said. So
that's that. Now that we have that, we actually have a functioning AI assistant. I'm going to show you how we test it in just one second, but we of course need to call this entrypoint function. To do that, we're just going to use the classic if __name__ == "__main__", then we're going to say cli.run_app, and we are going to use WorkerOptions here and pass the entrypoint function as our entrypoint_fnc. Don't worry too much about this. What this is going
to do is essentially run this entrypoint function asynchronously for us, and then we will automatically connect to LiveKit Cloud. The reason why this will work is because we've provided these values in our environment variable file, and these are the values that LiveKit looks for in order to make the connection. So make sure that you name these environment variables exactly what I have here; you need all four of them for this code to work, and then they'll automatically be searched for by LiveKit. We'll connect to
LiveKit Cloud, and then we can handle any of the clients that are connecting and add our AI voice assistant into the room. So you can see this is very straightforward. What will happen once we run this code is we'll wait for someone to connect; once someone connects, we will run this entrypoint function, and then we will generate a response and just say one thing to the user. Okay, so the last thing we need to do is write these prompts: we need the instructions and we need the content. So
let's go inside of our prompts.py and just copy in these prompts. I don't want to spend too much time writing them, so I'm just going to copy them in here. You can change them to be anything that you want, and you can copy them from the link in the description. The first one is "you're a manager of a call center, you're speaking to a customer," and so on, and then the welcome message is "begin by welcoming the user and ask them to provide the VIN of their vehicle." Again, adjust this to whatever your situation is.
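
As a rough idea of what prompts.py could contain, here is a minimal sketch. The constant names INSTRUCTIONS and WELCOME_MESSAGE and the exact wording are assumptions; the video only paraphrases the prompts, so treat the text as a placeholder to adapt.

```python
# prompts.py: illustrative wording only, adjust to your own use case

# System-style instructions passed to the realtime model (assumed name)
INSTRUCTIONS = (
    "You are the manager of a call center for an auto service shop and you "
    "are speaking to a customer. Help them with their request and collect "
    "the vehicle information you need along the way."
)

# First message added to the conversation so the agent greets the caller (assumed name)
WELCOME_MESSAGE = (
    "Begin by welcoming the user to the auto service center and ask them to "
    "provide the VIN of their vehicle. If they do not have a profile yet, "
    "offer to create one."
)
```

Keeping the prompts in their own module means the wording can change without touching the agent code.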
Okay, so now we need to import these and use them. I'm going to say from prompts import the welcome message and (what did we call the other one?) the instructions. Then I'm just going to plug in the variables, so this will be the instructions, and this here will be the welcome message. Perfect. Now that we have that, we're actually good to go ahead and run this, and then I'll show you that LiveKit actually has a playground that we can use in order
to test this agent before we start integrating it with our front end. I'm going to bring up my terminal here and simply type python, the name of my file, which is agent.py, and then dev. dev is how you run this in development mode. So I'm going to hit enter, we're going to run this, and we're going to see if we get any errors. It looks like we maybe spelled something incorrectly: it said to add LIVEKIT_API_SECRET in your environment, so it's possible that I just spelled the secret wrong
here. Let me just paste this here, and yes, I did. So it should be LIVEKIT_API_SECRET; you can see it's telling me that in the error, rather than whatever I had there. So let's change that to LIVEKIT_API_SECRET, apologies about that guys, and let's rerun this. We can see now that this is running, and the next step is obviously to test it. In order to test this without writing our own custom front end, LiveKit actually provides something called the Agents Playground. So what I can do is just search
in my web browser for "LiveKit agent playground"; I'll leave a link to this in the description. If we click into this, it will actually allow us to connect to our project and start interacting with the agent. You may need to sign in with LiveKit, but I believe my credentials are already here, so I'm able to get access to it, and then I can just click on the car service center, which is my current project, go connect, and we'll be able to mess around and
hear our agent, ideally. I'm going to be quiet now so we can hear this. "Thank you for choosing our Auto Service Center. To assist you better, could you please provide the VIN of your vehicle? If you'd like to create a new profile, just say 'create profile'." "I don't have a VIN." "No problem at all, let's create a new profile for you. Could you please provide your name, phone number, and email address to get started?" "tim1
[email protected]" "Thank you, Tim. Your new profile is now created. How can I assist you with your vehicle..." Okay,
so that's that. I'm just going to mute my mic and disconnect here. You can see that this is working. Obviously it's just kind of making up replies after this point, because we don't have any logic for what we want it to do, but there you go, it worked. We connected, and it was able to respond to our messages, and it actually did keep going and give us some conversation, just not in the format that we would have liked. So that's how we test it. I'm going to leave
this page open. I can always reconnect by just pressing Connect here, and then you can see the agent will connect. If you have issues with the agent connecting, just make sure your keys are all correct. I know a lot of people were having issues with this in the previous video, and I believe it's because they weren't using the correct environment variables. Anyways, at this point I'll assume that you were able to connect. Really the only things that could go wrong are that your code for some reason has a mistake and doesn't match what I've
written here (and again, you can copy this from the link in the description), or you just don't have the correct environment variables there. So just make sure that you double and triple check those, because I know personally I had some issues with them because I had the wrong values in the wrong slots. Silly mistake, but it happens. Okay, so now that we have this, I want to start providing some tools that my agent can use. I want it to be able to actually save VINs, or look up VINs, or
create a new profile, and in order to do that I'm going to write this database driver file. I'm just going to copy in the code here; again, you can find this on the GitHub linked in the description. The reason being I don't want to waste your time writing a bunch of database-related code, but I will walk through it so you can understand what it's doing and how this works. So for my database driver, I'm just copying this in. It's 60 lines of code, and this connects to a local SQLite 3
database. You can use any database you want: you can use MongoDB, you can use a JSON file, you can use an in-memory database, it doesn't matter. I'm just doing a quick example here so you can see how we interact with data with our model. You could even use something like retrieval-augmented generation if you wanted to; I'm just not going that far in this video. You can see that I bring in my sqlite3 package, and I have all of this other stuff: dataclass, contextlib,
all this stuff. I create a data class with my VIN, make, model, and year; this is just representing the information that I want to store in my database. I create a new DatabaseDriver class. I take in the database path, so where I actually want to save this, and then I just initialize the database. Initializing the database is right here, where what it's going to do is create a new table for me in SQLite with my VIN, make, model, and year, if it doesn't already exist. That's it. Then I have this get-connection method.
This is a context manager, which is simply going to yield the current connection to our SQLite database, so we don't constantly keep reconnecting to this if we don't have to. Sweet. Going down here, I then create two functions, create car and get car by VIN. This is really all that I need: the ability to make a new car and the ability to get one based on its VIN. If you wanted to do something like delete a car, or look up multiple cars, or you wanted more operations, of
course you could write them in here. I just kept it nice and simple. To create a car, we take in the VIN, make, model, and year, and we are going to return a new car. What we do is get the connection, get the cursor, insert this information into the database, commit that, and then return the car so we can use it. Same thing here with get car by VIN: we get the connection, and this time we just search for a particular VIN, which should be a
unique field. We fetch that value; if it doesn't exist we return None, otherwise we return the information about the car. Again, you can copy this from the link in the description if you want to use this exact code. Okay, so there we go, we have our database driver, and what I need to do is connect the database driver to my api.py file. So I have to make functions that my LLM is able to call inside of this class, and then in those functions I can do literally
anything that I want. So I'm just going to start with some imports here. I'm going to say import enum, then from typing (if we can spell this correctly) import Annotated, and then I'm going to say import logging, just so I can print some stuff out. Then I'm going to say from my database driver file import DatabaseDriver.

Okay, now, first things first, I'm going to set up my logging. I'm going to say logger is equal to logging.getLogger, and I'm just going to give this a name; I'll call this user-data. Then I'm going to say logger.setLevel and set this to the logging.INFO level, so I can display some information in my terminal: when we start using these functions, you can see that they are indeed working. I'm then going to make a new database instance, so I'm going to say db is equal to DatabaseDriver, and then I'm going to make a simple class which just defines an enum for the different fields in my car class. So I'm going to say class CarDetails, and this is going to be an enum.Enum; you'll see why we need this in one second. I'm going to say VIN is equal to "vin", Make is equal to "make", Model is equal to "model", and Year is equal to "year". You might be wondering why we need this. Again, just wait one second; it's so that we can annotate what should be passed to these functions by using this Python enum.
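Put together, the setup described so far might look like this minimal sketch. The member names and lowercase string values of the enum are assumptions based on how they're read out, and the DatabaseDriver import is left out so the snippet runs on its own.

```python
import enum
import logging

# Named logger so these messages are easy to spot among the agent's own output.
logger = logging.getLogger("user-data")
logger.setLevel(logging.INFO)


class CarDetails(enum.Enum):
    """Field names for the car currently being discussed (values are assumed)."""
    VIN = "vin"
    Make = "make"
    Model = "model"
    Year = "year"
```

Using enum members as keys, rather than loose strings, means a misspelled field name fails immediately instead of silently creating a new key.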
It's important that when you write these functions you do all of the typing for them in Python, so that the model knows what values to pass for the various parameters when it starts using a function as a tool.

Okay, now what we're going to do inside of this assistant function class is store some context, and that context is going to be the car that the user is talking about. The idea is that whenever you call something like a car service center, you first need to know what car you're talking about, right? Unless you're buying something, it's either a car that already exists, where you've been there before, or it's a new profile. So we want to store what car the user is currently discussing, so that the LLM has that context and information and can use it throughout the discussion. So I'm going to say self._car_details is equal to this. Now, the reason why I start this with an underscore is to make it a private member in Python. It's not actually private, because you can still access these values, but it's a convention in Python that you use so that something outside of the class doesn't try to access or modify this attribute. Now I'm simply going to say CarDetails.VIN, and then this is going to be an empty string. Notice I'm using the enum, and the reason I'm using my enum is so that the model is always going to know what field it should be using, or the name of the field, when it's trying to access values in my class. Again, you'll see this in a second, just bear with me. So I'm going to say CarDetails.VIN, and then CarDetails.Make is equal to an empty string, CarDetails.Model is equal to an empty string, and CarDetails.Year is equal to an empty string, and these will all be filled in later. Okay, so this is some general context that I want to store here, and you
can store anything that you want. Then beneath this I'm going to make a few functions that will act as the tools that the LLM can use. In order to do that, you need to decorate the function with this llm.ai_callable, and then you need to provide a quick description of what this function does and when it should be called. So I'm just going to say "lookup a car by its VIN". Then I'm going to define lookup_car as my function name. It's going to take in self, and it's going to take in a parameter, and that parameter is going to be a VIN. Now, for the VIN I need to specify what the type of this will be, so I'm going to say this is Annotated: it will be a string, and it will have llm.TypeInfo, where the description is equal to "the VIN of the car to lookup". Okay, so let's spell lookup correctly. Great. Then I'm going to put a colon in and start writing my function.

Now let me just make this a little bit larger so you guys can see it. So this is how you're going to write all of your different tools. You start with this @llm.ai_callable and you give a description; this will be read by the model to determine if it should use this function or not. You then provide the parameters. You don't need to provide any type annotation for self, because that's implicit, but for any other parameters you add, you need to define what they should be so that the model knows how to call the function. So I'm using this Annotated thing, which allows me to say, hey, this is of type string, and here's some extra information about this parameter so the model knows what it should use here. We're just describing, hey, this is the VIN of the car that needs to be looked up. If you had a more complex parameter, obviously that would become more important.

Okay, now right away I'm just going to say logger.info, and I'm going to say "lookup car", then "vin:" followed by %s, and then I'm just going to pass the VIN here, so that we have some output and we can see what the model's actually using to look up the car when it calls this function. So just some basic debugging, essentially. Then I'm going to use my database to look up the car, so I'm going to say result is equal to db.get_car_by_vin, and I'm going to pass in my VIN. I'm going to say if result is None, then we're just going to return "Car not found". Otherwise we're going to set our car details, so we're going to say self._car_details is equal to, and now we're going to write our fields. I'm actually going to copy this, and we're going to specify our VIN, make, model and year: result.vin, result.make, result.model and result.year. Obviously we can store more information, but for now that's totally fine. So what we're doing is looking in the database for this car based on the VIN that was passed by the AI model. If we don't find one, we
say not found. Otherwise we set the context, and then we're just going to convert this into a string and return it to the model. So we're going to say car_str is equal to an empty string. We're going to say for key, value in self._car_details.items(), then car_str plus-equals, and we're just going to convert this to a string, so we'll say key, colon, value and then backslash n (sorry, this value needs to be inside of a set of braces). Then we're going to return an f-string, and we're going to say "The car details are:" and then pass the car string. The reason why I'm doing this is I just want to give a string back to the model, so I can tell it, hey, here are the car details that you found, and then give it all of this information, rather than trying to return it in a Python dictionary, which could be problematic.

Now I'm just going to make a quick method that will do this for us, because we will use this later on. So up here I'm going to say def get_car_str. We will take in self, and then all we're going to do is copy this right here, and then we're going to say return car_str. Then I'm going to go here and just use self.get_car_str, like that. So now this is inside of here and it's a little bit more reusable.

Okay, great, so we have this first method now that allows us to look up a car. What I'm going to do is copy this, at least the function definition, and write the same thing now for creating a car. So rather than lookup_car, we're going to go with create_car. For the description we're going to say "create a new car"; that's really all we need. And for creating a car we're going to need a VIN, a make, a model and a year.
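To make the lookup tool concrete, here is a minimal sketch of the context dictionary, get_car_str and lookup_car. It is not the code from the video: the LiveKit decorator is left out so the snippet runs on its own, a plain string stands in for the llm.TypeInfo metadata inside Annotated, the database is passed in rather than being a module-level db, and InMemoryCars and AssistantSketch are hypothetical names. It also prints each key's value (for example "vin") rather than the enum member itself.

```python
import enum
import logging
from dataclasses import dataclass
from typing import Annotated, Dict, Optional

logger = logging.getLogger("user-data")
logger.setLevel(logging.INFO)


class CarDetails(enum.Enum):
    VIN = "vin"
    Make = "make"
    Model = "model"
    Year = "year"


@dataclass
class Car:
    vin: str
    make: str
    model: str
    year: int


class InMemoryCars:
    """Hypothetical stand-in for the SQLite driver, so the sketch runs alone."""

    def __init__(self) -> None:
        self._cars: Dict[str, Car] = {}

    def add(self, car: Car) -> None:
        self._cars[car.vin] = car

    def get_car_by_vin(self, vin: str) -> Optional[Car]:
        return self._cars.get(vin)


class AssistantSketch:
    def __init__(self, db: InMemoryCars) -> None:
        self._db = db
        # Context about the car under discussion; empty strings until known.
        self._car_details = {
            CarDetails.VIN: "",
            CarDetails.Make: "",
            CarDetails.Model: "",
            CarDetails.Year: "",
        }

    def get_car_str(self) -> str:
        # One "field: value" line per stored detail.
        car_str = ""
        for key, value in self._car_details.items():
            car_str += f"{key.value}: {value}\n"
        return car_str

    # In the video this method carries the llm.ai_callable decorator, and the
    # Annotated metadata is an llm.TypeInfo rather than a plain string.
    def lookup_car(
        self, vin: Annotated[str, "The VIN of the car to lookup"]
    ) -> str:
        logger.info("lookup car - vin: %s", vin)
        result = self._db.get_car_by_vin(vin)
        if result is None:
            return "Car not found"
        self._car_details = {
            CarDetails.VIN: result.vin,
            CarDetails.Make: result.make,
            CarDetails.Model: result.model,
            CarDetails.Year: result.year,
        }
        return f"The car details are:\n{self.get_car_str()}"
```

Returning a plain string keeps the tool result easy for the model to read back to the caller, which is the reason given above for not returning a dictionary.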
So let's make this a little bit easier to read and start defining the different parameters. I'm just going to copy this a bunch of times, because I don't want to have to keep writing it. Okay, one more time, and we're going to change these to make, model and year. To make this a little bit easier to read, let's go back here, shift-tab this back, and put our parentheses in the correct place. Okay, so now for the make, this will be a string, and rather than "the VIN of the car" we're going to say "the make of the car", and I can remove this "to lookup" thing. Just keep changing these and adjusting them to be correct. So for the model we'll say "the model of the car", and then this will be "the year of the car", and for the year we're going to change this to be an int, not a string. Okay, perfect. So now we have our parameters: self, VIN, make, model, year. That's what we need.

Now we're going to log the information that was passed, so we're going to say logger.info, we're going to say "create car", and then we're just going to pass the details so we can see what they are: vin %s, make %s, model %s and then year %s, and then we're going to pass these values in order, so VIN, make, model and year. Okay, great. Now we just need to make a car, so what we're going to do is say result is equal to db.create_car, and we'll pass the VIN, make, model and year. Then we're going to say if the result is None, so if there's some kind of issue here, we're just going to return "Failed to create car". Otherwise we're going to set these details, so I'm just going to copy this here and paste it here, because it's going to be the exact same thing. We're just going to grab the result and set it, and then we can return a string that says "Car created". Perfect.

Okay, now we're just going to write two more methods. One more method
I'm going to have is just going to be an internal one (or I guess external, but it's not going to be one called by the LLM). I'm going to say def has_car, taking self, and all I'm going to do is return self._car_details, look at the CarDetails.VIN field, and see if this is empty or not. So if this field is not equal to an empty string, it means we do currently have a car, which is something we want to know in order to change what the model is asking the user.

And then, last thing, we're just going to have this get_car_details function. So I'm going to copy this again, I'm going to say @llm.ai_callable, and rather than create_car this is going to be get_car_details, which is a method that the LLM can use to get the details of the car that it's currently talking about. All we're going to do is return an f-string, and we're going to say "The car details are:" and then self.get_car_str, like that. For the description I am simply going to write "get the details of the current car", meaning the car that we're currently discussing. And then, last thing, we're just going to log a message so we can see if this is being called: we're going to say logger.info, and this is going to be "get car details".

Okay, let me zoom out a little bit so we can read this a little bit easier, and we'll just go through what we wrote. We imported all the stuff, we created the logger, we use the database driver, we define our CarDetails enum, and then we start writing things like the initialization, where we're going to store some context about the current car that we're looking at. We have a method to get the current car string, so this just looks at these details and converts them to a string for us. We then have this lookup_car, which can be used by the LLM; this will take in a VIN, which is a string, and simply look for that in the database. Now let's just quickly, for a sanity check here, make sure that the VIN in the database is actually a string. So I'm going to go to my database driver and... vin TEXT PRIMARY, okay, good. I just wanted to make sure that it wasn't an integer, otherwise that could potentially be problematic. Okay, continuing here, we then have, yeah, this to look up the VIN; that's what we looked at. We have another one just to get the current car details, we have one to create a car, which takes in all of the different parameters that we need, and then lastly one just to tell us if we currently have car details, so if we have already selected a car or we have a profile already pre-selected.

Okay, great. So now that we have that, this allows our agent to simply use this. Because we provided the assistant function class here to the multimodal agent, it can just go ahead and use any of this functionality. So if we go and test this now, it should just be able to create a new entry in the database, for example, or look something up. So let's rerun our code here.
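As a recap of the remaining methods, here is a minimal sketch of create_car, has_car and get_car_details. As before this is an approximation rather than the code from the video: the LiveKit decorators are omitted, plain strings stand in for llm.TypeInfo, and FakeDriver and CarTools are hypothetical names used only so the snippet runs on its own.

```python
import enum
import logging
from dataclasses import dataclass
from typing import Annotated, Optional

logger = logging.getLogger("user-data")
logger.setLevel(logging.INFO)


class CarDetails(enum.Enum):
    VIN = "vin"
    Make = "make"
    Model = "model"
    Year = "year"


@dataclass
class Car:
    vin: str
    make: str
    model: str
    year: int


class FakeDriver:
    """Hypothetical stand-in for the database driver; it stores nothing."""

    def create_car(self, vin: str, make: str, model: str, year: int) -> Optional[Car]:
        return Car(vin, make, model, year)


class CarTools:
    def __init__(self, db: FakeDriver) -> None:
        self._db = db
        # Every field starts as an empty string, meaning "no car selected yet".
        self._car_details = {field: "" for field in CarDetails}

    def get_car_str(self) -> str:
        return "".join(f"{k.value}: {v}\n" for k, v in self._car_details.items())

    # Tool for the LLM (decorated with llm.ai_callable in the video).
    def create_car(
        self,
        vin: Annotated[str, "The VIN of the car"],
        make: Annotated[str, "The make of the car"],
        model: Annotated[str, "The model of the car"],
        year: Annotated[int, "The year of the car"],
    ) -> str:
        logger.info(
            "create car - vin: %s, make: %s, model: %s, year: %s",
            vin, make, model, year,
        )
        result = self._db.create_car(vin, make, model, year)
        if result is None:
            return "Failed to create car"
        self._car_details = {
            CarDetails.VIN: result.vin,
            CarDetails.Make: result.make,
            CarDetails.Model: result.model,
            CarDetails.Year: result.year,
        }
        return "Car created"

    # Plain helper, not exposed to the LLM.
    def has_car(self) -> bool:
        return self._car_details[CarDetails.VIN] != ""

    # Tool for the LLM (decorated with llm.ai_callable in the video).
    def get_car_details(self) -> str:
        logger.info("get car details")
        return f"The car details are:\n{self.get_car_str()}"
```

The year is typed as int while the other fields are str, matching the point above that the annotations tell the model what kind of value each parameter expects.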
And let's go back to our agent playground and just test this out and see if it works. So I'm going to go here, I'm going to go to connect, let's give this a second, I'm going to shut off my camera, and let's see what it says.

Agent: "Welcome to our Auto Service Center. Could you please provide the VIN (vehicle identification number) of your car? If you don't have a profile yet, just let me know and we can create one for you."

Me: "The VIN is 12345."

Agent: "It seems we don't have a profile for this VIN in our system. Would you like to create a new profile for your vehicle? If so, please provide the make, model and year of the car."

Okay, hopefully you guys can hear that all right, but obviously you can read the text as well. Let's go back to VS Code and make sure that it actually is using this. As we can see here, if we scroll through, we should be able to see some logging messages, although maybe we didn't set up the logger correctly. Let's see here. It says conversation item created, conversation item created, but I don't see the logs related to it calling the function. I'm just going to scroll through here and see if I can find this. So I did find it here, actually: it says user-data, and you can see that it did indeed look up the car, and then this is the VIN that it used to look it up. Obviously that's not the correct VIN, so that's a mistake there, but you get the idea: it did call the function. I'm going to mess with this a little bit more; I just want to make sure that it does call the function correctly. So let's go back here and ask it something else.

Me: "Can you make a new profile for me for a Jeep Wrangler 2024 with VIN 12345?"

Agent: "Your profile has been successfully created for your Jeep Wrangler 2024. How can I assist you further with your vehicle today?"

So we can see that it has Jeep Wrangler 2024, but it's using a random VIN. I don't know why it's using this; probably because the VIN I'm giving it is invalid, so maybe if I read out a proper VIN with the correct number of characters it would use that one. For now this is fine and we can proceed, but obviously that's something we would want to check and look through, because that's not the VIN that I was telling it. So for now let's proceed with the next steps, and then we can deal with the incorrect VIN later on; I don't want to spend too much time on it this second.

Okay, so now that we've got this, essentially what
I want to do is change the agent's behavior based on whether we have a VIN or not. If we have a VIN already, then I don't want to ask the user to provide a VIN again, right? I want to allow it to move on to the next step. A lot of times when you have these voice assistants you have a tree-like flow that you want to go through, where after you ask this question you want to ask that question, after that question's asked you want to ask the next question; you get the idea. So what we're going to do is use an event handler here from LiveKit, so we can hook into what the agent does based on a specific response that's received from the user. I'm going to say @session.on, and then I'm going to use the event user_speech_committed (I believe I spelled that correctly). There's a bunch of other events that you can hook into, but this is one where the user has finished talking; once they finish talking, the function that we write here will trigger. So I'm going to say def on_user_speech_committed, and what we're going to take in here is a message, which is an llm.ChatMessage.

Now, inside of this function we're going to look at the message, and we're first going to convert it into a string in case it is a list, because sometimes the message content will come in as a list, so multiple strings, and then we want to combine those together. So we're going to say if isinstance(message.content, list), then message.content is equal to backslash n, dot join, and we're going to join the following: in square brackets, "[image]" if isinstance(x, llm.ChatImage) (and this is with a capital; you'll see how this works in one second), else x, for x in message.content. All this is doing is replacing each item in the message, which in this case is x, with the string "[image]" if it's an image. You can actually send images here to this LLM, right? You can have video, you can have image data. So we're looking through the message and saying, okay, if it's an image, then we're just going to put the string "[image]" here, because we're not able to process the image with how we're currently doing this; otherwise we'll take whatever the string message was, and we'll combine those with newline characters as our message content. Hopefully that's clear, but we're just adjusting the message content to essentially strip out the images and combine all of the individual string messages together.

All right, now I'm going to check if we currently have a car, so if we've either looked up a car or created a new car. I'm going to say if the assistant function's has_car returns true (that's why we wrote that method), then we're going to call a function here; otherwise we're going to call a different function. This allows us to
essentially branch into two different paths where we tell the LLM to do something different. So what I can do is say def find_profile, and I'm going to take in the message, which is a string. Then I'm going to say session.conversation.item.create, and same thing as before, I'm going to pass an llm.ChatMessage, and we're going to say role is equal to "system" and the content is equal to, and then I'm going to provide some specific content here. I'm then going to say session.response.create. Okay, now I'm going to make another function (don't worry, we'll discuss these more in one second), and this other function is going to be called handle_query. Same thing as before, but this time it's going to have a user message, and the user message content is just going to be the message.content. What we're going to do is, if we already have a car, we're just going to handle whatever the user's query is, so we're going to call handle_query and pass the message like this; this is the user's message. Actually, that reminds me, I need to change the parameter type here to be llm.ChatMessage, because that's what this is, so let's fix that. All right, so if we have the car we're going to handle the query, otherwise we're going to say find_profile, and same thing, we're going to pass the message. The reason being, I don't want to redirect the user to some other department, or have them talk to someone, until they give us the details about their car. So whether they make a new profile or look up their profile with us, we want to encourage them to do that first, before we route them into this second response.

Now, in both of these functions you can do whatever you want, but in my case I'm just going to provide a custom message to the user saying, hey, we need to collect your profile before we're able to move on. Then here, for handle_query, I just generate a response using the original system message: I pass in whatever their content was, and then we reply to that. But I could again put a custom message here, and then I could continue and keep branching. If I wanted to, in handle_query I could ask them a series of more questions before I move them into whatever the next step is, whether that's talking to a real person, asking them to come into the showroom, whatever it may be. This is the branching system you can set up, where you have different functions that handle the different types of requests that the user has based on some state. In this situation the state is whether the user has already told us information about their car or not.

Okay, so what I'm going to do now is go to prompts, and I'm just going to copy in another prompt that we're going to use to handle looking up the user's VIN. This is going to be a function: it says lookup VIN message, it takes a message parameter, and it just says, if the user's provided a VIN, attempt to look it up; if they don't have a VIN or the VIN doesn't exist in the database, create the entry, blah blah blah, you get the idea. And then we embed the user's message. Great. So what I'm going to do now is pass that here, so we need to import that: I'm going to import the lookup VIN message, and for my content this is going to be the lookup VIN message, as a function, and then I'm going to pass in my message. Okay, so what that's going to do is
then generate that string for me, the prompt that we pass to the LLM. So that should be it for our AI agent. We can stop this, rerun it, and now test this out and see if it works. So let's go back to our agent playground and connect.

Agent: "Welcome to our Auto Service Center. Could you please provide the VIN of your vehicle, or let me know if you'd like to create a profile for your car?"

Me: "The VIN is V12345."

Agent: "It appears that we don't have a profile for a vehicle with the VIN V12345. Would you like to create a profile for your car? If so, please provide the make, model and year of your vehicle."

Me: "Sure, my vehicle is a Jeep Wrangler 2021."

Agent: "Your Jeep Wrangler 2021 has been successfully registered in our system. How can I assist you further with your vehicle today?"

Me: "Can I schedule a service appointment?"

Agent: "I can help you with that. Please let me know your preferred date and time for the service, along with any specific concerns or services you'd like to address."

Okay, sweet, so it seems like that's working. I just want to go and look at the logs and see if it did create it with the correct VIN or not. So I'm going to scroll through here and see if we can find this. Okay, user-data, create car, and yes, perfect, it used the correct VIN this time. So what I'm going to do is disconnect and reconnect, and see if it can look up this VIN for me; I just want to test that feature. So let's go back, let's disconnect, let's rejoin so we get the agent in the new state, and let's see if we can look up that VIN.

Agent: "Welcome to our Auto Service Center. Could you please provide the VIN of your vehicle, or let me know if you'd like to create a new profile?"

Me: "Can you look up the VIN V12345?"

Agent: "It seems that we don't have a vehicle with that VIN in our records. Would you like to create a new profile for your car?"

Me: "What is the VIN that you looked up?"

Agent: "The VIN I attempted to look up is 1HGCM82633A123456. If that's not correct, please provide the correct VIN or let me know if you'd like to create a new profile."

Me: "Please look up exactly V12345."

Agent: "Thank you for providing the VIN. I have found the details for your vehicle: make Jeep, model Wrangler, year 2021. How can I assist you further with your Jeep Wrangler today?"

Okay, sweet, so that's what I wanted to test, guys; sorry for going back and forth there. It looks like I just need to be specific, or, because I'm not giving a valid VIN, it's attempting to shortcut it and look up a VIN that contains those characters. Obviously that's not what we want, so we could add some more specific instructions to the prompt, but for now it's fine: it's able to look it up, right, and it found that information. If we come back here we should be able to see the log: if we look at user-data, lookup car V12345, and then it gets that data. Okay, sweet.

So that is it for the AI agent; now it's time to actually implement this on the front end. All right, so now we are moving
on to the front end. Now, first things first, I just want to take all of this backend code and put it in a folder called backend, just to clean things up a little bit. That's going to involve me stopping my agent from running right now, so let's quit that, and let's grab all of these files and throw them inside of backend. I'm going to throw my virtual environment inside of there as well. Okay, sweet, so that is inside of backend. Next thing we need to do is create the frontend folder. I can delete this pycache folder for right now, so let's get rid of that, and let's move on to making our front end.

To make our front end we're going to use React, and specifically we're going to do that with Vite, at least for the command. So I'm going to type npm create vite@latest, and then this is going to be frontend, dash dash, and then dash dash template, and then react. In order for this to work you do need npm installed on your system. You don't need to be a pro in React here, but obviously I'm not going to walk through all of the React code; this is more just to show you how to get the voice assistant connected, the tokens, all of that kind of stuff.

By the way, you can do this with pretty much any front end that you want. I'm doing it here with React, but if we go back to the LiveKit documentation and, for example, go home and try to find the front-end stuff (I'm just going to go to "integrating with your front end"), if we click into this you'll see that we have a lot of different options: you can do it with Android, with Swift, with React, and there should be a lot of other options as well. In fact, let's go back here and go to the SDK quick starts, and there you go: we have web SDKs for Next.js, React, JavaScript and Unity, you have native SDKs for Swift, Android, Android Compose, Flutter, React Native and Expo, and there are other SDKs as well. So if you want to learn how to do this for a different front end, you can just go to this documentation and click into it. I just followed along with the React documentation, and that's where you'll see a lot of this code comes from; obviously I just made it a little bit more custom, so I'm going to walk you through that now.

Okay, sweet, so let's go back here. Now what we're going to do is cd into frontend, type npm install, and then we can run the server, but we do need to install a few dependencies for LiveKit as well, so let's wait for that to finish. Okay, so now we're going to run this command here: npm install @livekit/components-react @livekit/components-styles livekit-client --save. Again, this comes directly from the LiveKit documentation. So let's go ahead and run that to install the dependencies that we need, and let me close this file while we're at it. Okay, so while that runs, I'm going to go into my frontend folder and start setting a
few things up. Specifically, I'm going to create a few components that we'll need, and I'm going to bring in some CSS styles that I've already written, which again you can find from that GitHub link in the description. I just don't want to bore you by focusing too much on the front end when really you guys care about the voice assistant integration. So in my src folder here I can go ahead and delete my assets; I'm not going to need that. I can delete the public folder as well, I'm not going to need that one, and then we can continue from there. Inside of my App.jsx I'm just going to clear all of this out: I'm going to remove the state, I'm going to remove the imports of the React logo and the Vite logo, and I'm going to make a new folder inside of src called components.

Now, inside of components I'm just going to have two; we're going to keep this nice and simple. One is going to be the LiveKit modal. This is the thing that's going to pop up when we want to connect to LiveKit, and I'm going to make this a .jsx file. Then I'm going to make the simple (if we can spell simple correctly) voice assistant, and this is going to be .jsx as well; that's going to handle the logic for the voice assistant. Then I'm going to do the same thing for CSS: I'm going to say SimpleVoiceAssistant.css, and that's all we need for now.

Now I'm going to copy in a bunch of CSS styles here. Again, you can find this from the link in the description, in the GitHub. So just go to the GitHub, look in the components folder, find SimpleVoiceAssistant.css and copy it into this file. For App.css, same thing: I have some custom CSS, so I'm going to copy it into this file. And same thing for index.css: I have some as well, so I'm going to copy that into this file. Okay, so now we have the index, app and simple voice assistant CSS completed, and we can move on to writing some of the code.

What I'm going to start with is my App.jsx. Inside App.jsx I'm just going to write a very simple landing page (you can make this really anything that you want), and then we'll move on to actually implementing the LiveKit components. Keep in mind that LiveKit has all kinds of documentation here, and if I go back you can see that what we're using is some components from LiveKit which will handle things like rendering the audio for us, rendering the video for us, multiple tracks, sharing, participants. If we actually go into the React components, they have a whole
library of Documentation here it explains how to do a bunch of stuff shows all of the components that they have like bar visualizers chat entry Focus layout so if you do really want to make a more advanced front end obviously check this out here um and you can you know use a lot of the pre-built stuff which is kind of what we're doing here okay so let's go back and what we're going to do is we're going to start by defining finding some state so we're going to say const Show support and then set show
support is going to be equal to use State and we're just going to make this equal to false this is what will show the like support model on screen we're then going to create a div here so we're going to say div and this is going to be class name equal to app and then inside of app we're going to have a simple header this will be class name equal to header inside of the header we'll have a div we'll say this is class name equal to Logo and then we're just going to call this
AutoZone that's the name of my website okay then we're going to have a main section for the main section we'll have a section this will be className equal to hero which will be like the top section on the page and then we can just put in some kind of placeholder text here so we'll just have an h1 and we'll say get the right parts right now that can be our slogan with the correct capitalization then we can have a simple paragraph tag and we can say free next day delivery
on eligible orders okay eligible how do you spell this let's fix that here there we go better at coding than spelling next we'll just have a simple search bar say className is equal to search-bar inside of here we're just going to have an input field okay we're going to say type is equal to text and we'll say the placeholder is equal to enter vehicle or part number okay sweet and then we'll just have a simple button then we're pretty much done here we're going
to say button search okay and then outside of the section but still in the main I'm just going to make another button which will float in the bottom right this is going to say className equal to support-button with button fully spelled out and then here we're going to say talk to an agent exclamation point and we're going to have an onClick handler here okay which will allow us to talk to the agent so what we're going to do here is just have a function and we're going to say const handleSupportClick
okay and what this is going to do is say setShowSupport equal to true okay and then we're just going to use this and we're going to say handleSupportClick so the idea here is that when you press this button it's going to set this state to true and then we're going to open up a modal which is going to get you to enter your details like your name and then you can connect to the LiveKit room so what we're going to do now is we're going to design the LiveKit
modal so we're going to go into this component here let's start writing that out so we're going to say import and this is going to say useState and then useCallback okay from react we then are going to import some LiveKit components so we're going to say import LiveKitRoom and RoomAudioRenderer from and this is going to be @livekit/components-react and we need to spell this correctly okay next we're going to say import and we're going to import the styles so we're going to say
@livekit/components-styles I believe actually that's it and then we'll have one other import but we don't have the component written yet so we don't need that now okay then we're going to say const LiveKitModal we're going to take in our setShowSupport okay so that we can close this modal if they press the X button this will be a function and then what we'll do here is we'll start defining some of the information that we need in order to render the LiveKit
room so first things first we're going to use a bit of state here we're going to say isSubmittingName setIsSubmittingName okay it's going to be equal to useState and this will be false we then need to know what the user's name actually is so we're going to say name and then setName is equal to useState and this will be an empty string okay after that we're just going to do a little bit of CSS here to kind of create a modal that allows the user
to enter their name and then connect into the LiveKit room so we're going to say return we're going to render a div we're going to say className is equal to and this will be modal-overlay we're then going to make another div and we're going to say this is className equal to modal-content we're going to have another div and this is going to be className equal to support-room okay then we're going to say are we submitting the name so if we are submitting
the name then what we want to do is prompt the user to enter their name so if isSubmittingName we're going to have a form we're going to say onSubmit okay and this is going to be equal to and we'll have a function here in a second and we're going to say the className is equal to name-form inside of the form we're going to have an h2 and we're going to say enter your name to connect with support we're then just going to have an input and some buttons okay so we're
going to have input for the input we're going to have type is equal to text we're going to have the value equal to name we're going to have the onChange equal to and we're going to take an e and then we're going to set the name so we're going to say setName e.target.value okay and then we'll have the placeholder equal to your name and then we're just going to define required and let me just format this so that we're able to read it so let's go format document why is that not
working it's because we have some issue here okay it's because I don't have the else branch for my if here so let me just render a fragment and then we need to add the onSubmit function as well sorry I'm just trying to get this to format guys but it won't format if there's an error so I need to fix the error so for the error we're just going to say const handleNameSubmit is equal to this and then we can just pass the function handleNameSubmit
and now I can format my document okay so you can see we have if we're submitting the name then what we're going to do is render this form I haven't finished the form I'll write that in one second otherwise we'll render this okay so for the rest of the form we're going to say button and this is going to say connect and we're going to have the type equal to submit so when we press this it will then call this function here in the form we're then going to have another button we're going to say
type is equal to button and we're going to say the className is equal to cancel-button and we're going to have onClick is equal to and what this is going to do is call setShowSupport with false okay so this setShowSupport is the function that's in our app.jsx so we'll then no longer render the LiveKit modal which we'll be rendering in just one second and actually let's just add some text to the button so you can actually see it so it says cancel now okay and then
we will format again all right great now if we are not showing the form that means that we've already submitted the name then what we're going to do is we're going to show the LiveKit room okay now the LiveKit room is where we can render the audio and we can show all of the LiveKit related components so for that we're going to say LiveKitRoom which we imported from LiveKit and there are a bunch of different props that we need to pass to this in order
for it to work so the first prop that we need is the serverUrl for now we're going to leave that blank but we will fill that in a second we're then going to need the token same thing we're going to leave that blank this is a connection token or an authorization token we're then going to say connect is equal to true this means that as soon as the room is rendered we're by default going to connect to it then we can specify if we want to display video or not in this case I don't
want video so I'm going to say video is false and then I want audio so I'm going to say audio is true then I need to have an onDisconnected function here which will be called when we disconnect from the room so when we disconnect from the room we're going to say setShowSupport to false which means we're no longer going to show the modal and we're going to say setIsSubmittingName to true okay this means that when we come back into this component we will be able to submit the user's name now inside of
here what we're going to do is we're going to render the RoomAudioRenderer this will actually allow us to hear the audio and then what we'll also need to do is render a few other components for controlling the audio so things like muting our microphone showing the different input sources showing the waveform of the agent's audio showing the text data we're going to write that component in one second but this is kind of stubbing what we need for the LiveKit room okay so before we go any further
I do want to test this in order to do that I'm going to export this component so I'm going to say export default and then this is the LiveKitModal and then I'm going to go to app.jsx and I'm going to render this modal here at the bottom of the screen so in order to do that I'm going to go outside of this div and I'm going to say if showSupport so showSupport && and then we're going to show the LiveKitModal and we're going to pass to this setShowSupport
is equal to setShowSupport now the LiveKitModal was automatically imported for us from ./components so in order to run this and see if it works I'm going to type npm run dev now this should run this on localhost so let's open it up and you can see that we have our website and if we press on this button it looks like there's an issue with the modal so probably just a problem with my CSS or for some reason it's not appearing correctly so let me see what that problem
is and I'll be right back so silly mistake we just need to set isSubmittingName to true because we want that to be the default state so that it shows us the form and not the LiveKit room because in this case it's showing the LiveKit room which is not going to work right now because we haven't connected it so now if I press this you can see that it shows me this there's some CSS related issue but the cancel button is working so let me try to fix that CSS I probably
just misspelled a class name or something so that it will appear and of course modal was spelled as model not modal so let's fix that replace the e with an a and if we come back and we refresh and we click on this you can see that now we have the modal and everything is working and if we fill in the name and press connect so let's do this here you can see that it just quits out because the LiveKit room is not working right now but you get the idea okay
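To make the modal flow a bit more concrete, here's a rough sketch in plain JavaScript of the two pieces we just went through: which view the modal shows, and the props that get handed to LiveKitRoom. The helper names currentView and buildRoomProps are made up for illustration, they are not part of LiveKit or of the code in the video. In the real component these values are passed straight to LiveKitRoom as JSX props.

```javascript
// Hypothetical helper: decides what the support modal should show.
// isSubmittingName defaults to true in the component so the form shows first.
function currentView({ showSupport, isSubmittingName }) {
  if (!showSupport) return "hidden";
  return isSubmittingName ? "name-form" : "livekit-room";
}

// Hypothetical helper: the props described for <LiveKitRoom>.
// serverUrl and token are left for the caller to fill in.
function buildRoomProps({ serverUrl, token, setShowSupport, setIsSubmittingName }) {
  return {
    serverUrl,
    token,
    connect: true, // connect as soon as the room is rendered
    video: false,  // audio-only assistant
    audio: true,
    onDisconnected: () => {
      setShowSupport(false);     // close the modal
      setIsSubmittingName(true); // show the name form again next time
    },
  };
}

console.log(currentView({ showSupport: true, isSubmittingName: true }));
```

The point of resetting isSubmittingName on disconnect is that the next time the modal opens, the user lands on the name form rather than on a room that is no longer connected.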
so now if we want to actually connect to the LiveKit room what we need to do is get the LiveKit URL as well as a token and I'm going to discuss how the token works okay so first things first we're going to go here and we're going to make a new environment variable file in our front end so we're going to call this .env now in .env we're going to pass a variable it's important it's called exactly this which is VITE_LIVEKIT_URL now this needs to be the same as what
we had in our back end so we can just go copy it and take this wss URL okay so let's take that and paste that here for the LiveKit URL so that's how we know what LiveKit project to connect to but we also need to have a token now we're going to discuss the token in a second I promise for now though let's add the server URL to our LiveKitRoom so in order to do that we can just do a set of brackets here curly braces whatever you want to call
them and then we are simply going to write the following import.meta.env.VITE_LIVEKIT_URL okay so this is how you import an environment variable inside of React or specifically with Vite so that will import the server URL and then we need a token now the token is essentially our access credentials right it's the ability to connect to this room we don't just want anyone to be able to connect to any of our rooms so in order for us to connect from our front end to the LiveKit server we need to have some kind
of access credential now if we're just testing this we can get like a set of credentials that only last about 15 minutes that we can use to test this out however in a more production environment what we want to do is we want to have a backend server that issues us new Credentials anytime we want to connect to a room and then we have complete control of who is able to connect so the basic idea is that if you want to connect to live kit you need an access token the access token should be generated
from a backend server and typically in order for that to work you'd want the user to be authenticated right so maybe they would sign into the web page and then they'd be issued a token in our case we just want anyone to be able to connect to the room so we can just issue a token to anyone but that token will specify what room they're able to connect to so with LiveKit you can have multiple different rooms and you can have multiple sessions going on at the same time we might not want someone to
be able to join room a but maybe we want them to be able to join room B so we can issue them a token that only gives them access to that specific room and then when they use that token To connect it will bring them into that room hopefully that makes a little bit of sense again it will be more clear in a minute but for now we're going to get a temporary token that will allow us access to the live kit room to test this on our front end and then I'm going to show
you the more permanent solution which is writing a very slim backend server which will issue the tokens for us so I'm going to go to my car service center and I'm going to go to settings And keys which is the page I'm already on okay so go to your project go to settings go to keys in live kit Cloud then you're just going to click on the key you're using I believe mine is this the Windows machine and you're going to press generate token now you just need to give this name okay so you can
see here that we've specified the username of the person that this token will be used for and the room that we want them to join and then we have the permission So if they can participate and publish tracks they can subscribe to tracks they can publish data so for example we could issue a token that would allow someone to join but not to talk right and you can you know check all of these different boxes and change the permissions so I'm just going to generate this token with all of the permissions allowed it's going to
give it to me and I'm going to copy it and this is what I'm going to use now in Vite or in React so I'm just going to take the token and I'm just going to paste it here as my token to my LiveKitRoom and now this will actually allow me to join the room and hear the audio okay so now that we have the token we could test this out but I'm realizing that nothing's going to happen at this point because we're not displaying anything on the screen and we're also
not running our agent so before we go further I don't want to jump the gun here what I want to do is I want to just write this SimpleVoiceAssistant component which will display some stuff on the screen so we can actually see what's going on and we know that this is indeed working so I'm sorry that we're not able to test it right now we just need some more components first so bear with me and let's get a few things on the page so I'm going to say import
useVoiceAssistant from @livekit/components-react we're going to import the BarVisualizer which will allow us to visualize the waveform we're going to import the VoiceAssistantControlBar useTrackTranscription and useLocalParticipant and let me just format this so you guys are able to read it okay so these are some of the hooks and components that we're going to use we're then going to import Track and this is going to come from livekit-client and we're going to
import useEffect and useState okay we're then going to import the CSS so we're going to say import ./SimpleVoiceAssistant.css okay great next thing we're going to make our component we're going to say const SimpleVoiceAssistant and this will be equal to a component function and inside of here what we're going to do is we're going to display all of the audio messages as text right all of the transcriptions but before we do all of that let's simply just try to render the control
bar just to get something else on the page so that we can see something going on so let's render a div we'll give this a className which will be equal to voice-assistant-container and let's make sure this is returned from the component okay then we're going to have another div we're going to say className is equal to and this is going to be visualizer-container inside of here we're going to put the BarVisualizer but we'll do that in one second we're then going to have another div and this
is going to be className is equal to control-section and in the control section we're going to display the VoiceAssistantControlBar which is just a pre-made component that provides a bunch of controls for us we're then going to have a div we're going to say className is equal to conversation okay and for the conversation then we're going to render all of the different messages now that's all we need for now we're going to export this so we're going to say export default and then this will be SimpleVoiceAssistant
and for now this should work this should show us something on screen I could be wrong but let's try it out so let's go to our LiveKitModal and let's bring in the SimpleVoiceAssistant so we're going to say import SimpleVoiceAssistant from ./SimpleVoiceAssistant and let's put this beneath the RoomAudioRenderer okay so SimpleVoiceAssistant now let's go back let's refresh let's go talk to an agent let's give a name and let's see if this works all right so silly mistake but we are forgetting to set
isSubmittingName so I'm going to say setIsSubmittingName to false like that so now hopefully this will actually do something for us and we'll be able to test this okay so let's refresh let's go talk to agent let's go name connect and okay so nothing's happening but we were able to get to this page and then I can disconnect and it closes that's what we're looking for the reason nothing's happening is because we're not running our agent so we're able to connect to the room but then we don't have any agent that's
connecting to the room with us so that's going to be the next step run our agent and then hopefully we'll get some audio output okay so what I'm going to do is I'm going to cd out of this and now I'm going to activate my virtual environment again in my back end so I'm going to cd into backend and run the Scripts activate script of my virtual environment because I'm on Windows and I'm going to go python agent.py dev okay now that's going to run our agent for us and now the agent is running and now what should happen
is when I connect to my room the agent will connect as well so let's try this talk to an agent let's go name Tim and connect Welcome to our Auto Service Center how can I assist you today if you have a specific question or need help please provide the VIN of your vehicle if you don't have a profile yet just let me know and we can create one for you Hi how are you doing today I'm doing well thank you for Asking okay sweet so you can see that that's working now what I did to
get that to work I just paused the video quickly is I just needed to restart my front end server because we had added this value in the environment variable file but because we hadn't restarted the server it wasn't loaded by Vite yet so when I first connected I wasn't getting anything and that's because we hadn't restarted the server so when I restarted the server everything worked and you can see now the voice assistant is connecting and if you go look at the logs of the voice assistant you can see everything that's happening here right like
what the agent is saying and all this kind of stuff okay perfect so now we want to continue building the front end and I want to make this SimpleVoiceAssistant look a little bit better and I want it to display specifically all of the text that is being said by both the person and the voice assistant so what I'm going to do is I'm going to make a new component called Message now this is going to take in the type and the text of the message and just display it on
screen so we're just going to say div className is equal to message we're going to have a strong tag here and this is just going to be the person's name so we're going to say className is equal to message- and then this is going to be the dollar sign and then in curly brackets type and we're just going to put this inside a set of backticks actually so that this works so let's fix that and then this needs to go inside a set of braces okay there we go so that's our
className let's fix that and then we're going to have the following here we're going to say type is equal to agent question mark then we're going to say agent otherwise we're going to say you colon like that okay so that this is like the prefix of the message and then we're going to have a span and for the span we're going to have className is equal to message-text and then we're going to put the text of the message okay just a simple sub component here for displaying our messages now we
want to get the transcription from our participants so to do this we're going to say const state and then audioTrack and then agentTranscription is equal to useVoiceAssistant so the useVoiceAssistant hook will look for the first agent participant in the room you could have multiple agents by the way but we're just having one and then it's going to get the state of the agent which will tell you if it's connecting if it's talking if it's listening you can actually use this state variable if you want to see that state you'll
get the audio track from the agent itself and then the live transcription of what the agent is saying okay that's what useVoiceAssistant does now we'll have the track from the agent but we also need to get the track from the person or the participant which is us so to do that we're going to say const localParticipant is equal to useLocalParticipant same thing this is going to get the local participant for us right and give us all this information like whether the microphone is enabled all that kind of stuff right and the
local participant reference itself which is what we want okay next we're going to say const and then this is going to be segments colon and then userTranscription is equal to useTrackTranscription now useTrackTranscription will give us the transcription of any track in this case we want the local participant's track so we're going to pass inside of here publication and this is going to be localParticipant.microphoneTrack okay then we need to provide the source the source of this is going to be Track.Source.Microphone and then
we need to give the participant itself so we're going to say participant is equal to localParticipant.localParticipant I know it's a little bit confusing you can find this in the docs but this is how you get the local participant's transcription we're then going to have some state and we're going to say const messages and then setMessages is equal to useState and this is going to be an empty array okay so now at this point we will have the agent transcription and the user's transcription segments and what we're
going to do is just have a simple useEffect hook which is going to look to see if any of these are changing and then update the messages with those transcriptions so we're going to say useEffect like this okay we're going to have our function and then we're going to have our dependency array and the dependency array will be the agentTranscription or the userTranscription so if either of these changes then we want to run this useEffect hook we're going to say const
in here allMessages is equal to and then we are going to unpack here or what do you call this the three dots I can't remember the name of it but you guys know what I'm talking about we're going to say agentTranscription question mark dot map and then this is going to be t arrow and then bear with me here we're going to say dot dot dot t and then type colon agent okay and then we're going to have question mark question mark empty array in case that doesn't exist then we will do
the same thing except this is going to be the userTranscription and then the type is going to be user okay then we are going to sort these so we're going to say dot sort a comma b we're going to say a.firstReceivedTime minus b.firstReceivedTime and did I spell received correctly of course I did not so let's fix the spelling okay so what is going on here what we're doing is we're getting all of the messages from the agent transcription and from the user transcription now this
comes in an array and it is like the sentences that they're saying so you'll see the agent will say one sentence then it'll say the next sentence then it'll say the next one and it's kind of streaming in these different replies so we're grabbing all of those and we're just mapping these and adding the type to all of the different messages that's what we're doing for the agent and for the user then we combine these together and we sort them by the first received time so we know who talked when and we have
the correct order on screen then we say setMessages allMessages and we just keep running this effect anytime the agent or the user says something that's all that's happening okay so that is our messages hook now that we have that we can just start displaying some stuff visually so we can display the track of the agent so I can say BarVisualizer and I can say state is equal to state and let's fix this and close the component I can say what else do we need here barCount is equal to 7
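Here's the merge-and-sort step from that effect pulled out into a plain function, just as a sketch so you can see it on its own. The function name mergeTranscriptions and the sample segments are made up for illustration. The only fields it relies on are the ones mentioned in this walkthrough: id, text and firstReceivedTime.

```javascript
// Combine agent and user transcription segments into one ordered list.
// Either input may be undefined before any speech has been transcribed.
function mergeTranscriptions(agentTranscriptions, userTranscriptions) {
  return [
    // Tag every agent segment so the UI can label it
    ...(agentTranscriptions?.map((t) => ({ ...t, type: "agent" })) ?? []),
    // Same for the local user's segments
    ...(userTranscriptions?.map((t) => ({ ...t, type: "user" })) ?? []),
  ].sort((a, b) => a.firstReceivedTime - b.firstReceivedTime); // oldest first
}

// Made-up sample data standing in for real transcription segments
const agent = [
  { id: "a1", text: "Welcome to our Auto Service Center", firstReceivedTime: 100 },
  { id: "a2", text: "How can I assist you today", firstReceivedTime: 300 },
];
const user = [{ id: "u1", text: "Hi how are you doing today", firstReceivedTime: 200 }];

console.log(mergeTranscriptions(agent, user).map((m) => `${m.type}: ${m.text}`));
```

Inside the component the result of this would go straight into setMessages, and the effect reruns whenever either transcription array changes.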
so if you want it to be wider you can add more bars or if you want to change the look of it you can change the number of bars that you're using and I can say trackRef is equal to the audioTrack which is coming from here this useVoiceAssistant hook right okay so that will display the voice assistant audio you can do the same thing for the participant we're just going to do it with the voice assistant next in our conversation we're just going to display all of our messages so we can just
say messages.map and we're going to say msg comma index okay and then this is going to map to a new Message component and for the Message we're going to say key is equal to msg.id or the index here in case there's no id we're going to say the type is equal to msg.type and we're going to say the text is equal to msg.text okay so that should be it for the message and now everything should be displaying properly assuming our token is still valid if not we'll get a
new token I'll show you that in a second so let's refresh this talk to an agent name Tim let's see if it works so we got some error there I assume that's because our token is expired and yes we got a 401 issue so this is what I was talking about before this token that I generated let me go back to where I get the token here only lasts for 900 seconds now that's because this is meant to be used temporarily and if you want a more permanent solution you're supposed to issue the token yourself
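If you want to check whether a pasted token has already expired before blaming something else for the 401, here's a small sketch. LiveKit access tokens are JWTs, so the expiry sits in the exp claim of the payload. The function names are made up for illustration, and this only decodes the token for debugging, it does not verify the signature.

```javascript
// Decode the payload (middle part) of a JWT without verifying the signature.
function decodeJwtPayload(token) {
  const payloadPart = token.split(".")[1];
  // JWTs use base64url, so convert to standard base64 and restore padding
  const base64 = payloadPart.replace(/-/g, "+").replace(/_/g, "/");
  const padded = base64 + "=".repeat((4 - (base64.length % 4)) % 4);
  return JSON.parse(atob(padded));
}

// True when the token's exp claim (seconds since the epoch) is in the past.
function isExpired(token, nowMs = Date.now()) {
  const { exp } = decodeJwtPayload(token);
  return typeof exp === "number" && exp * 1000 <= nowMs;
}
```

With a token that lasts 900 seconds, isExpired starts returning true fifteen minutes after it was generated, which is when the connection attempt begins failing.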
so what we'll do for testing now is we'll generate a new token okay so I'm going to generate this token and this is now a new fresh token that we can use but again it will expire in 900 seconds so it's not a permanent solution but for testing that's a fine way to do it okay so now I'm just going to replace this token here with this new one and if we go back here and we refresh and talk to an agent now ideally this should work let's try it Welcome to our Auto Service
Center could you please provide the VIN of your car the vehicle identification number of your car if you don't have an existing profile with us just let me know and we can create one for you sweet so that is working however the messages are not showing up so let me disconnect here and let's see what we did wrong with displaying the messages all right so silly mistake here but I need to call this agentTranscriptions plural and then this needs to be agentTranscriptions and then userTranscriptions here as well I just forgot the
s so that's why we weren't getting the transcriptions from the agent so if I come back here refresh try again this should work now so let's go Tim and run this and we got an error okay so let's see what the error is and there is an error occurring agentTranscription is not defined okay so where did we have agentTranscription that's right here so let's fix this to be agentTranscriptions and userTranscriptions apologies guys just a silly spelling mistake there try again and see Welcome to our Auto Service Center How can I assist
you today if you could please provide the VIN of your vehicle I'll be able to look up your profile and assist you further okay so a silly mistake here when it's not displaying the messages I forgot to return the div from this Message component of course that would be the issue anyways now that we're returning that if we go back here and we go talk to an agent and we go Tim and connect let's see if this works Welcome to our Auto Service Center could you please provide the VIN of your vehicle so
I can look up so that's working now the reason you were getting that uh zero popping up is I was just debugging to see if the messages were actually working properly and they are so anyways you can see now the messages are appearing so this is working we pretty much have finished the front end the only thing that we need to do now is deal with our tokens so like I said what Needs to happen is we need to write our own server that can issue the tokens to our front end this gives us more
control over the tokens and allowing people to join the different rooms so we're going to go ahead and do that now from our backend so in our backend I'm going to make a new file called server.py now inside of here we're just going to write a really simple flask server uh that will allow us to issue the tokens to the front end so just bear With me while I write this out it's about 40 lines of code and then I will show you how we call this and how we dynamically get tokens and how you
can kind of have some more control over the rooms that are being created so what I'm going to do here I'm going to say import os okay I'm going to say from livekit import api I'm going to say from flask import Flask and request I'm going to say from dotenv import load_dotenv I'm going to say from flask_cors import and this will be CORS in all capitals I'm going to say from livekit.api import LiveKitAPI I'm going to say from livekit and actually we already have this so
we can just say ListRoomsRequest and I'm going to say import uuid and this needs to be in lowercase okay this will allow us to create a unique identifier now all of these we already have installed in requirements so as long as we're using our Python virtual environment when we run the server then we'll be fine now what I'm going to do is I'm just going to say load_dotenv and I'm going to say app is equal to Flask and for the app we always initialize with __name__ then we're going
to say CORS and we're going to wrap our app and we're going to pass resources like this and it's going to be equal to the following so bear with me here the key is going to be the pattern string r"/*" and then for the value we're going to have origins colon asterisk now what I'm doing is I'm just specifying that we're going to allow any other domain to access this back end you usually don't want to do this but I'm just making it explicit here that we'll allow anything to access it so that we
don't get any cross origin request issues okay or resource sharing issues whatever it's referred to as you guys have probably seen before if you Try to send a request from a backend or sorry from a front end to a back end and they're not on the same origin then you can get this chors error that's what this module is going to fix for us so that you don't get that when you run this server okay now all we need to do is run the server and make a simple rout that will issue a new access
token from LiveKit. It's very easy to do this, but just bear with me. I'm going to say if __name__ is equal to "__main__", then we're going to say app.run. We'll run this on host equal to "0.0.0.0", on port equal to 50001, and we'll say debug equals True so that if we make any changes while this is running, it'll automatically reload. Now we're going to define a single route in this backend, and this route is going to be /getToken, and
all this is going to do is issue a new access token, which will allow the user to connect to a new room. Now, you have a lot of things you can do here with the LiveKit API on the backend: you can list all of the rooms that exist, delete a room, create a new room, or list the different participants. You can imagine that if you were creating a more complex system here with the AI agent, and you wanted to have multiple people, for example, in the same
room, that's where you would do this. You would set this up on your backend server so you really have complete control. What I'm showing you is a basic implementation, so keep that in mind. What I'm going to do here is define a function, and it's going to be called get_token, with an underscore to stick with Python convention. What this is going to do is get the name and the room that the user wants to connect to. So we're going to say name is equal to
request.args.get with "name", and then we'll use "my name" as the default if you don't pass anything. Then we're going to say room is equal to request.args.get with "room", otherwise None. The idea here is that we're going to allow the user to pass the name and the room as query parameters, so the things that come after the question mark in the URL. The name is their name, so we can identify the participant, and the room is the room they want to connect to if they have a
specific room in mind. So that's something we can do: we can allow people to connect to the same room. We're not going to do that right now, but it is something you could do. We're going to say if not room, then room is equal to the result of a function that we'll write in one second, which will be generate_room_name. Anytime that you want to issue a new access token, you need to give a name or an identifier for the room that you want the user to join.
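As a rough sketch of this step, here is the same query-parameter handling written with only the Python standard library instead of Flask, so it can be run on its own. The helper name resolve_name_and_room and the existing_rooms argument are placeholders of mine, not names from the video, and generate_room_name here is only a stand-in for the real helper, which asks LiveKit for the current rooms.

```python
import uuid
from urllib.parse import parse_qs, urlsplit


def generate_room_name(existing_rooms=()):
    # Stand-in for the helper written later in the video: "room-" plus the
    # first eight characters of a random UUID, regenerated while the name is taken.
    name = "room-" + str(uuid.uuid4())[:8]
    while name in existing_rooms:
        name = "room-" + str(uuid.uuid4())[:8]
    return name


def resolve_name_and_room(url, existing_rooms=()):
    """Return (name, room) read from the query string of a /getToken URL."""
    query = parse_qs(urlsplit(url).query)
    name = query.get("name", ["my name"])[0]  # same default as in the video
    room = query.get("room", [None])[0]       # None when no room was requested
    if not room:
        # No room requested: place the caller in a fresh, randomly named room.
        room = generate_room_name(existing_rooms)
    return name, room


if __name__ == "__main__":
    print(resolve_name_and_room("/getToken?name=Tim"))
    print(resolve_name_and_room("/getToken?name=Tim&room=lobby"))
```

The first call falls back to a generated room name, while the second keeps the requested room, which is how two callers could end up in the same room if you chose to allow that.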
In our case, anytime someone presses that button we want to put them in a new room with a random room name, so that they're talking to a new AI agent. Again, you can have multiple people in the same room, but if you did that, they would be in the same conversation chain as the AI agent that already existed in the room. I understand it's a little bit confusing; you'll see what I mean in a second. So we're going to write this function in a sec, and then we're going to say
token is equal to api.AccessToken, where api is the LiveKit api module, and we're going to load in two variables. We're going to say os.getenv and load LIVEKIT_API_KEY, then copy this and load LIVEKIT_API_SECRET. Then I'm going to go down to the next line and say .with_identity, and the identity will be their name. We'll go down to the next line and say dot, and this is going to
be .with_name, passing the name. We'll go down to the next line and say .with_grants. These are the permissions, and the permission will be api.VideoGrants. For the video grants we're going to say room_join is equal to True and room is equal to the room. This is just the default way to generate an access token. What it's going to do is use the credentials that we have for our project, our API key and our API secret. It's going to identify the
user with this name and this identity. You can change the identity and the name if you want, but that's what we'll use for now. And it's going to give them permission to join just this room; that's all we're doing. There are more granular permissions that you can provide here, but that's what we're doing with this token. Then we can return token.to_jwt(), which gives us a JWT. So now we just need to write this generate_room_name function. To do that, we're going to write two functions. We're going to say async
def generate_room_name. For the name, we're going to say name is equal to "room-" plus the string of a uuid.uuid4(), and we're just going to grab the first eight characters of that. This is going to create a unique identifier for us. It's not guaranteed to be unique if we only grab the first eight characters, but in our case this is totally fine. So we're just saying "room-" and then some random string after it. Now we're going to
just make sure that this room name is actually unique. So we're going to say rooms is equal to await get_rooms(), which is a function we're going to write in one second that will get all of the rooms from LiveKit. And we're going to say while the name is in rooms, we're just going to generate a new name, so we generate it again in the loop. Then we're going to return the name. So what we're doing is generating this name for the room and hoping it's going to be unique. We're
going to look at all the rooms that currently exist. If this name does exist in the rooms, it means it's not unique, so we're going to generate another one and keep going until we do get a unique name. Next we're going to say async def get_rooms, and for this we're going to get all of the rooms currently active. You can do that by saying api is equal to LiveKitAPI(). We can say rooms is equal to await api.room.list_rooms, and we pass to this a ListRoomsRequest(). Then
I can say await api.close() to shut it down, and I can return a list of these rooms. So this is going to be room.name for room in rooms.rooms, and actually we don't need to wrap this in list() because it's already going to be a list. So all we're doing here is using the LiveKit API. From the LiveKit API you have a bunch of different operations you can perform. For example, one of them here is listing the rooms that
currently exist. You can also delete a room, add a room, and see the participants that are in rooms. I'm just showing you one simple example with the API, but again, there's a lot more that you can do. We grab all the room names, and then we use that in the generate_room_name function. That should be it for our Flask backend server. Essentially what happens now is, because we've used our credentials from LiveKit, when we run this, if we send a request
to /getToken, it will generate a random room name for us, unless we've provided a room that we'd like to connect to, and then it will allow us to connect to that room using the token from our front end. So now all we need to do is send the request from the front end to the backend. First things first, let's run the backend. To do that, we'll need a new terminal, so let me split this again. Let me go into my backend, so cd backend. Let me activate my virtual
environment. I'm going to try to make this a little bit larger so you can see these commands. So I run the activate script in the virtual environment's Scripts folder. Let me just clear this, and then we're going to say python server.py. Great, so now the server is running, and we have our agent running, and we have our front end running as well. These are the three things that need to be running locally for this to work. Now we can go back to our front end, so let's close all of this, and
we're just going to send a request to the backend. From our front end, we're going to go to vite.config.js and set up an API proxy. Here, under plugins, we're going to say server, and then proxy. For the proxy we're going to have "/api", and we're going to say that if you're sending a request to anything that starts with /api, we are going to change the target to be http://localhost:50001, because that's the port of our Flask server. We're going to say change
origin is true, that's changeOrigin, and we're going to adjust the path. So we're going to say rewrite, and this is a function that takes the path and calls path.replace with a regular expression for the /api prefix. So what is this doing? Essentially all we're saying is: if you try to go to any /api endpoint when you're sending a fetch request here, which you're going to see in one second, we want to send that request to port 50001, which is our
backend server. We change the origin so it's not whatever the front end is running on, it's just this server, and then we rewrite the path so that we remove the /api component and send the request to whatever comes after it. This is required in order to send the request without getting a CORS error. Now we're going to go to, I believe, the LiveKit modal, and in the LiveKit modal we are going to get a token as soon as this modal is
opened up. So we're going to say const getToken is equal to useCallback, and we're going to pass an async function. The async function is going to take in the user's name, and what it's going to do is send a request to the backend to retrieve the token. So inside a try block it's going to say const response is equal to await fetch, and we're going to fetch /api/getToken, but we need to put this in backticks so that we're able to embed some query parameters. For our query
parameters we're going to say name is equal to, and then we're going to have a dollar sign with braces, and inside we call encodeURIComponent and pass the user's name. After we do that, we're going to get the token from the response. So we're going to say const token is equal to await response.text(), because the backend just replies with text, which is the token. And we're going to say setToken(token), and we're going to add some state for storing our token. So we're going to say const [token, setToken] is
equal to useState, and we'll make this null for now rather than an empty string. So we're going to set the token, and then we're going to call setIsSubmittingName with false. Otherwise we're going to have a catch, where we catch the error and just log it, so we can say console.error and show whatever that error message is. So this is our getToken function, and what we're going to do now is, under our
handleNameSubmit, we're going to change this to call the getToken function. So we're going to say e.preventDefault() so that we don't refresh the page, and we're going to say if name.trim(), so if you do pass a name, then we'll call this getToken function with the user's name. Now, this is a callback function, and what it's going to do is send a request, grab the token for us, and then we should be good to go. Now we just need
to do one thing: we need to change the token here to be the token from our state, so that's coming from the backend. Sorry, I guess I spelled token incorrectly, so let's fix that. So now token is there, and then we just need to make sure that we only show this if the token exists. So we're going to add a second ternary on token, so token followed by a question mark, and then down here we're going to say colon and then null. So we're just saying: if we're submitting the
name, show this; if the token exists, show this; if the token doesn't exist, then just show null. It'll take a second to render this, because obviously we need to grab the token first from this request. So that should be it. Let's just make sure all this is still running; it looks like it is. Let's go back now and see if, without hardcoding our token, we're able to get this to work. So let me go to talk to an agent, enter Tim, and then connect, and we've got some problem. Something happened there, or at
least it's not showing the page. Let me look at what the issue is here. All right, so two silly mistakes: I forgot to add the e parameter to my handleNameSubmit, and I forgot to add the dependency array to my useCallback. Because I didn't have the e, it was just refreshing the page; it wasn't preventing the default behavior, so that's why this was not working. If I go back here and refresh now, and I enter Tim and connect, we can see it takes a second and then we get an issue:
it says invalid authorization token. Interesting, so that token is not correct. However, it did look like it sent the request to the backend, and we got an error in our backend server, so it never actually gave us a correct token. So let's go back to the backend, to server.py, and we need to change this to say aclose, not simply close. On line 24, it should be await api.aclose(); the backend error is telling us about that issue. Again, apologies, but this is just part of the development cycle, so that's why I leave
it in. Let's go to talk to an agent, enter Tim again, connect, and see if it works. "Welcome to our Auto Service Center. Could you please provide me with the VIN, the vehicle identification number, of your car? If you don't have a profile yet, just let me know and we'll create one for you." "Can you make a profile, please?" "I'd be happy to help you create a profile. Could you please provide the following details about your..." Okay, perfect. I'm going to disconnect from that, because you can see that it is indeed working, and now
we are getting that token from the backend. So again, that is the advantage of using that backend server: you now have complete control, and you can do all kinds of fancy stuff if you really want to in terms of issuing the tokens, connecting people to the same room, all that kind of stuff. And now, if multiple people press that button at the same time, they're all going to get different access tokens with access to different rooms, so they're all going to have their own instance of the AI agent running. They're not going to be
using the same conversational chain, so that is something to keep in mind. With that said, that will pretty much wrap this up. I will let you know that if you do want to make this a real call center, you can do that with LiveKit. If we go into the docs here, to the Realtime API and then the multimodal agent, there is a way to set it up so you can call a phone number using something like Twilio and have it, what do
you call it, actually respond to you on the phone. You can see there's some stuff about SIP trunking if you go to this page in the docs, the telephony page under agents. You can see how this works there, and you can make it so that if you call a phone number, your agent connects to that room, the call room, and starts talking to you. So you could literally make this work over the phone, not just in the front-end UI, but obviously that's what I
wanted to show you here, because I imagine that's what most of you are looking for. There are all kinds of cool things you can do with LiveKit, and this is obviously just scratching the surface. If you want to see more tutorials on LiveKit, please let me know by leaving a comment down below, and I'm sure we can team up again and do another cool tutorial for you and maybe show some more advanced behavior. With that said, I am going to wrap it up here. I hope you enjoyed. All
of this code will be available from the link in the description, and I look forward to seeing you in another video. [Music]
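If you want to poke at the token endpoint without the front end, a small Python client can build the same request the modal sends. This is only a sketch under assumptions: the function names are mine, the base URL simply reuses the host and port mentioned in the video, and the regular expression is my guess at the proxy rewrite, which is described but not spelled out.

```python
import re
from urllib.parse import quote
from urllib.request import urlopen

# Assumed from the video: Flask server on localhost, port 50001.
BACKEND = "http://localhost:50001"


def rewrite_proxy_path(path):
    # Mirrors what the dev-server proxy is described as doing:
    # strip a leading "/api" before forwarding to the backend.
    return re.sub(r"^/api", "", path)


def build_token_url(name, base=BACKEND):
    # Percent-encode the name, roughly like encodeURIComponent in the browser.
    return f"{base}/getToken?name={quote(name, safe='')}"


def fetch_token(name, base=BACKEND):
    # Needs the Flask server to be running; the body is the token as plain text.
    with urlopen(build_token_url(name, base)) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    print(rewrite_proxy_path("/api/getToken?name=Tim"))
    print(build_token_url("Tim Smith"))
```

Calling fetch_token("Tim") with the server running should return the token text that the front end would otherwise store in its state.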