Hi, this is Lance from LangChain. Over the next few videos we're going to be talking about query translation, and in this first video we're going to cover the topic of multi-query.

Query translation sits at the first stage of an advanced RAG pipeline, and the goal of query translation is to take an input user question and translate it in some way in order to improve retrieval. The problem statement is pretty intuitive: user queries can be ambiguous, and because we're typically doing some kind of semantic similarity search between the query and our documents, if the query is poorly written or ill-posed we won't retrieve the proper documents from our index.

There are a few approaches to attack this problem, and you can group them in a few different ways. Here's one way I like to think about it. A few approaches involve query rewriting: taking a query and reframing it, like writing it from a different perspective. That's what we're going to talk about in depth here, using approaches like multi-query or RAG-Fusion, which we'll talk about in the next video. You can also take a question and break it down to make it less abstract, like into sub-questions, and there's a bunch of interesting papers focused on that, like least-to-most from Google. You can also take the opposite approach and make a question more abstract; there's an approach we're going to talk about in a future video called step-back prompting that focuses on asking a higher-level question derived from the input.

The intuition for this multi-query approach is that we're taking a question and breaking it out into a few differently worded questions from different perspectives. It is possible that the way a question is initially worded, once embedded, is not well aligned with, or in close proximity in this high-dimensional embedding space to, a document that we want to retrieve and that's actually related. So the thinking is that by rewriting the question in a few different ways you increase the likelihood of retrieving the document that you really want. Because of nuances in the way that documents and questions are embedded, this more shotgun approach of taking a question and fanning it out into a few different perspectives may improve the reliability of retrieval. That's the intuition, really. And of course we can combine this with retrieval: we can take our fanned-out questions, do retrieval on each one, combine the results in some way, and perform RAG. So that's the overview, and now let's go over to our code.
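To make the intuition concrete, here is a toy sketch. It is not the notebook's code. It assumes a bag-of-words `embed` function as a crude stand-in for a real embedding model, and the documents, the original question, and the rewrites are all made up for illustration. The point is only that a question can score poorly against a related document as first worded, while one of its rewordings lands much closer.

```python
import math
from collections import Counter

def embed(text):
    """Toy 'embedding': lowercase word counts (assumption: stand-in for a real model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def best_match(query, docs):
    """Return (document, score) for the document most similar to the query."""
    q = embed(query)
    scored = [(doc, cosine(q, embed(doc))) for doc in docs]
    return max(scored, key=lambda pair: pair[1])

def fan_out_best_match(queries, docs):
    """Run best_match for every reworded query and keep the strongest hit."""
    return max((best_match(q, docs) for q in queries), key=lambda pair: pair[1])

# Hypothetical documents and questions, for illustration only.
docs = [
    "task decomposition splits a complex task into smaller steps",
    "memory lets an agent recall earlier observations",
]
original = "how do agents break big problems apart"
rewrites = [
    "what is task decomposition",
    "how is a complex task split into smaller steps",
]

print(best_match(original, docs))          # the original wording shares no words with either doc
print(fan_out_best_match(rewrites, docs))  # a rewording overlaps with the related doc
```

A real embedding model is far less brittle than word counts, so the gap in practice is smaller, but the same mechanism applies: each rewording gets its own chance to land near the document you want.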

So this is a notebook, and we're going to share all this. We're just installing a few packages and setting our LangSmith API keys, which we'll see is quite useful shortly. There's our diagram. First I'm going to index this blog post on agents: I'm going to load it, split it, and then index it in Chroma locally. This is a vector store; we've done this previously. So now I have my index defined.

Here is where I'm defining my prompt for multi-query, which says: you're an assistant, and your task is to reframe this question into a few different sub-questions. So there's our prompt right here. We'll pass that to an LLM, parse the output into a string, and then split the string by newlines, so we'll get a list of questions out of this chain. That's really all we're doing here. Now here's a sample input question, and there's our generate-queries chain, which we defined. We're going to take that list and simply apply each question to the retriever.
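The query-generation step described here (prompt, LLM, parse to a string, split by newlines) can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the notebook's chain: the prompt wording is my own paraphrase, `generate_queries` is a hypothetical helper name, and `fake_llm` is a canned stand-in for a real model call so that the example runs without any API key.

```python
from typing import Callable, List

# Assumption: illustrative wording, not the exact prompt used in the video.
PROMPT_TEMPLATE = (
    "You are an assistant. Rewrite the user question below as {n} differently "
    "worded questions, one per line.\n"
    "Question: {question}"
)

def generate_queries(question: str, llm: Callable[[str], str], n: int = 5) -> List[str]:
    """Format the prompt, call the LLM, and split its text output into questions."""
    prompt = PROMPT_TEMPLATE.format(n=n, question=question)
    raw = llm(prompt)  # the LLM is any callable that maps a prompt string to a string
    # Split by newlines; dropping blank lines is an extra safeguard added here.
    return [line.strip() for line in raw.split("\n") if line.strip()]

def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model call; returns a fixed response."""
    return "What is task decomposition?\n\nHow do agents split a task into steps?\n"

queries = generate_queries("What is task decomposition for LLM agents?", fake_llm)
print(queries)
```

Swapping `fake_llm` for a real model call leaves the rest unchanged, since the only contract is "string in, string out".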

We'll do retrieval per question, and this little function here is just going to take the unique union of documents across all those retrievals. So let's run this and see what happens. We're going to run this and get some set of documents back.

Let's go to LangSmith now, where we can see what happened under the hood. Here's the key point: we ran our initial chain to generate a set of reframed questions from our input. Here was that prompt, and here is the set of questions that we generated. Then, for every one of those questions, we did an independent retrieval; that's what we're showing here. So that's the first step, which is great.

Now I can go back to the notebook and we can show this working end to end. We're going to take that retrieval chain and pass it in as the context of our final RAG prompt. We'll also pass through the question. We'll pass that to our RAG prompt here, pass it to an LLM, and then parse the output. Let's see how that works. OK, there it is.

So let's go into LangSmith and see what happened under the hood. This was our final chain. We took our input question and broke it out into these five rephrased questions, and for every one of those we did a retrieval. We then took the unique union of documents, and you can see it in our final LLM prompt: answer the following question based on the context. This is the final set of unique documents that we retrieved from all of our sub-questions. Here's our initial question, and there's your answer.

So that shows you how you can set this up really easily, and how you can use LangSmith to investigate what's going on, in particular to investigate the intermediate questions that you generate in the question generation phase. In future talks we're going to go through some of the other methods that we introduced at the start of this one. Thank you.
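The retrieval side of the walkthrough (retrieve per question, take the unique union, pass the result as context to the final prompt) can be sketched as follows. This is a sketch under assumptions rather than the notebook's implementation: documents are plain strings, `retrieve` and `llm` are hypothetical callables supplied by the caller, `unique_union` and `multi_query_rag` are names chosen here, and the final prompt wording only approximates the one read out in the video.

```python
from typing import Callable, List

def unique_union(doc_lists: List[List[str]]) -> List[str]:
    """Flatten per-question results and drop duplicates, keeping first-seen order."""
    seen = set()
    unique = []
    for docs in doc_lists:
        for doc in docs:
            if doc not in seen:
                seen.add(doc)
                unique.append(doc)
    return unique

def multi_query_rag(
    question: str,
    queries: List[str],
    retrieve: Callable[[str], List[str]],
    llm: Callable[[str], str],
) -> str:
    """Retrieve for each reworded query, merge the results, and answer the original question."""
    doc_lists = [retrieve(q) for q in queries]        # one independent retrieval per query
    context = "\n\n".join(unique_union(doc_lists))    # each document appears once in the context
    prompt = (
        "Answer the following question based on this context:\n\n"
        f"{context}\n\n"
        f"Question: {question}"
    )
    return llm(prompt)

# Hypothetical stand-ins so the sketch runs without a vector store or a model.
fake_index = {"q1": ["doc A", "doc B"], "q2": ["doc B", "doc C"]}
echo_llm = lambda prompt: prompt  # returns the prompt so the merged context can be inspected

print(multi_query_rag("What is task decomposition?", ["q1", "q2"], fake_index.get, echo_llm))
```

Preserving first-seen order is a design choice made for this sketch; the video only says that the function takes the unique union, so any deduplication that yields the same set of documents fits the description.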

RAG from scratch: Part 5 (Query Translation -- Multi Query)