Scribe
Scribe

Ti piace? Rendi Scribe ancora migliore lasciando una recensione

Ottieni l'estensione Chrome

Sfoglia

  • Video Popolari
  • Video Recenti
  • Tutti i Canali

Strumenti Gratuiti

  • Scaricatore di Sottotitoli Video
  • Generatore di Timestamp Video
  • Riassuntore di Video
  • Contatore di Parole Video
  • Analizzatore di Titoli Video
  • Ricerca Trascrizioni Video
  • Analisi Video
  • Creatore di Capitoli Video
  • Generatore di Quiz Video
  • Chat con Video

Prodotto

  • Prezzi
  • Blog
  • Ottieni l'estensione Chrome

Developers

  • Transcript API
  • API Documentation

Legale

  • Termini
  • Privacy
  • Supporto
  • Mappa del sito

Copyright © 2026. Realizzato con ♥ da Scribe

— Se questo ha reso la tua vita più facile (o almeno un po' meno caotica), lascia una recensione! Promettiamo che ci renderà felici. 😊

Related Videos

What is RAG? (Retrieval Augmented Generation)

Video thumbnail
133.09k1,925 Parole9m readGrade 18
Condividi
Channel
Don Woodlock
hello everyone uh welcome to my code deare uh video series um what I'm doing is I'm rotating through three different types of topics educational topics uh use case topics and then kind of bias ethics safety uh topic so now on the education rotation and today what I wanted to talk about is uh what is retrieval augmented generation or rag uh and you may think that I'm going into some kind of nook and cranny of the AI uh field but this is a very important and popular kind of solution pattern that I see um being used
over and over and over again for uh how to leverage large language models so I thought I would explain it uh to you uh and the the the thing that this is used for is basically systems that leverage large language models but on your own content so let me describe that if you think of like the chat GPT experience and if you think about that um relative to like the search engine experience that we had before if you ask a question like um I don't know what color is the sky or how do I fix
this plumbing issue or something like that a search engine would go out uh or appear to go out search the internet find relevant content and then just list that content for you list those links and then you as a user would need to click on the links that seem seem right read it digest it and figure out the answer to your question what a large language model does is it seems to do that first part meaning leverage the content on the whole internet but instead of just listing that content it sort of digests it digests
it combines it assembles it together and answers your question sort of generates an answer um so it's a whole lot better I mean search engines have been great but this is taking the whole experience to another level and in addition the question and answering uh you can also give it instructions like write me this document or write me a lesson plan to teach geometry to seventh graders uh and it will do something similar it will kind of assemble content that it SE that it has seen uh that talks about geometry or seventh graders or how
to do lesson plans or whatever uh pulls that together assembles it and then writes out a lesson plan okay so it's a much better experience than just taking the raw content from the internet but it really uh creates something new from that now let's say you want that same experience but on your own content so it might be a chatbot on your website or you might have a library of PDF documents that this documentation for one of your products uh and instead of just linking the user to parag sections of the documentation you want to
actually answer their question uh it might be your service ticketing uh system so when a new issue comes in you could say how would I resolve this issue and it can assemble past similar issues uh and then come up with a new uh new solution based on that so this is an incredible experience that these large language models offer but how can you create that experience on your own content uh that might not be available to the internet or available to these large language models well the solution to this is this rag um architecture this
retrieval augmented uh generation architecture so now I'm going to do my best to explain that uh to you so let's say you have a um user and I'm going to use the example of a uh patient chatbot and the content source is going to be that content from your website let's say or could be content from PDF documents or or whatever but you want this to be the content to answer the patient's questions so if the patient has a question like how do I prepare for my knee surgery instead of just going to chat sheet
PT and getting a generic answer you'd like to provide an answer that's from your health system or a question like do you have parking you'd like to provide an answer for your health system for your the office where the patient is seen okay so that's a scenario that I'd like to do so the patient has a question uh and I'm going to do do you have parking have parking um you can uh imagine that question being bundled up into a prompt what's called a prompt and I'll describe this more later so there is the question
that prompt is sent to a large language model and that large language model will come up with a response to that question okay now um if you just wanted to use uh chat GPT let's say or some other llm uh without any extra content you could just use this flow how do I prepare for my knee surgery or do you have parking put that into a prompt send that to the uh large language model and get a response back okay but uh but what we want to do is enhance this experience with our own content
so let's say here is your content source and again this might be all the content of your website or PDF documents or internal ticketing system or databases or that uh that sort of thing and what you'd like to do is something called called the prop before the propt so in these systems you don't just send the user question to the large language model you usually have some level of instructions So the instructions might be you are a contact center specialist working for a hospital answering patient questions that come in over the Internet uh please be
uh nice to the patients and responsive and folksy because that fits with our brand or some instructions like that are sometimes sent with the prompt um and then uh Additionally you want to provide the information that the L llm needs to answer the question so what you'd ideally like is information from your website to be included here um and uh and that to be sent to the llm as well so the full prompt might be your instructions it might be something like please use this content um in order to answer the patient question at the
end and then you put in a bunch of information about parking or about knee surgery or whatever the patient asked you put that in the prompt before the prompt then you have the question then you send that whole package to the llm and the llm will give a great response based on your content okay with me so far so um so this notion is the prop before the prompt um and and that's why prompt engineering and these types of things are a big field right now now because you can really hone the um these systems
by doing a better and better job with the actual prompt before the prompt um in uh in this style now the last trick here is your website or your content is huge and it talks about all kinds of topics Beyond parking and Beyond knee surgery so you really want to somehow pull out only the parts of your content that are relevant to the patient's question so this is another um a tricky part of this whole rag architecture uh and the way that works is that um you take all your content and you break it into
chunks or these systems will break it into chunks so chunk might be a paragraph of content or a p or a couple paragraphs a page something like that and then those um chunks are sent to a large language model could be the same one or a different one and they are turned into a vector and uh so each each paragraph or each chunk will have a vector which is just is just a series of numbers and that series of numbers you can think of it as the numeric representation of the essence of that paragraph and
what's uh different about these numbers just they're not random numbers but paragraphs that talk about a similar topic have close by numbers they almost have the same vectors okay so in addition to the uh it's a numera Zed version of the paragraph but it's such that similar paragraphs on similar topics will have similar vectors will have similar numbers so that means that what happens is when um uh a user will ask a question like do you have parking let's say then that is also sent to the llm in real time right after the user asked
the question that comes up with the vector as well you could think of that as the question vector and then what happens we do we do a mathematical comparison real quick between the vector of the question and then the vectors of your content and pick like the top five documents that are closest to this question so do you have parking will be a vector then you have all your content and it's going to try and find the five documents that taught the most about parking basically um and so it'll find those I don't know what
that is it'll find those documents let's say uh from these it'll grab the paragraphs associated with those documents um and it'll use that here so those will be the subset of your content basically that is used as part of the prompt before the prompt okay so this whole uh concept is uh kind of vectorizing your content uh typically that then our storage in something called a vector database which is basically a representation of your content in this numeric form and then this system that you build this rag system will uh take the question find retrieve
the most relevant content make that as part of the prompt before the prompt send that to the llm and then you'll get a good response back actually so it's a little bit confusing but um but it's actually not that confusing um uh I just made it more confusing by this horrible uh horrible drawing but this whole thing is um what is uh called rag retrieval so you're retrieving the relevant documents from your content you're augmenting the generation process so you're augmenting the lm's ability to do generative AI based on the documents that you retrieve so
that's why it's retrieval augmenting generation okay so I hope that made sense uh like I said this is a very popular um solution pattern that I'm seeing over and over again in fact the majority of llm projects that I see are this kind of thing using my content packaging that up with an llm system to create a kind of chat chpt like experience for my employees or for my customers for my users that kind of thing and it works extremely well that's why uh that's why it's so popular so I hope that was interesting and
educational and made sense if you have any questions please leave them for me uh as part of the comments uh thank you very much
Video correlati
How to Use AI to Improve Patient No Show Rates
13:02
How to Use AI to Improve Patient No Show R...
Don Woodlock
9,652 views
How to set up RAG - Retrieval Augmented Generation (demo)
19:52
How to set up RAG - Retrieval Augmented Ge...
Don Woodlock
25,999 views
RAG Explained
8:03
RAG Explained
IBM Technology
75,747 views
Stanford CS25: V3 I Retrieval Augmented Language Models
1:19:27
Stanford CS25: V3 I Retrieval Augmented La...
Stanford Online
157,266 views
What are AI Agents?
12:29
What are AI Agents?
IBM Technology
230,441 views
Transformer Neural Networks, ChatGPT's foundation, Clearly Explained!!!
36:15
Transformer Neural Networks, ChatGPT's fou...
StatQuest with Josh Starmer
666,294 views
How to Improve LLMs with RAG (Overview + Python Code)
21:41
How to Improve LLMs with RAG (Overview + P...
Shaw Talebi
51,686 views
How to build Multimodal Retrieval-Augmented Generation (RAG) with Gemini
34:22
How to build Multimodal Retrieval-Augmente...
Google for Developers
54,491 views
What is generative AI and how does it work? – The Turing Lectures with Mirella Lapata
46:02
What is generative AI and how does it work...
The Royal Institution
976,942 views
A Survey of Techniques for Maximizing LLM Performance
45:32
A Survey of Techniques for Maximizing LLM ...
OpenAI
185,236 views
Discover Prompt Engineering | Google AI Essentials
30:30
Discover Prompt Engineering | Google AI Es...
Google Career Certificates
59,736 views
What is Retrieval-Augmented Generation (RAG)?
6:36
What is Retrieval-Augmented Generation (RAG)?
IBM Technology
677,495 views
Concerning the Stranded Astronauts
13:27
Concerning the Stranded Astronauts
StarTalk
473,572 views
AI Pioneer Shows The Power of AI AGENTS - "The Future Is Agentic"
23:47
AI Pioneer Shows The Power of AI AGENTS - ...
Matthew Berman
553,362 views
Should You Use Open Source Large Language Models?
6:40
Should You Use Open Source Large Language ...
IBM Technology
351,788 views
Vector Databases simply explained! (Embeddings & Indexes)
4:23
Vector Databases simply explained! (Embedd...
AssemblyAI
318,125 views
What is RAG?
0:53
What is RAG?
What's AI by Louis-François Bouchard
15,325 views
How to avoid bias in Machine Learning
14:46
How to avoid bias in Machine Learning
Don Woodlock
2,980 views
What is Retrieval Augmented Generation (RAG) - Augmenting LLMs with a memory
9:41
What is Retrieval Augmented Generation (RA...
What's AI by Louis-François Bouchard
33,138 views
Python RAG Tutorial (with Local LLMs): AI For Your PDFs
21:33
Python RAG Tutorial (with Local LLMs): AI ...
pixegami
210,323 views