Scribe
Scribe

Gostou? Torne o Scribe ainda melhor deixando uma avaliação

Obter Extensão do Chrome

Navegar

  • Vídeos Populares
  • Vídeos Recentes
  • Todos os Canais

Ferramentas Gratuitas

  • Baixador de Legendas de Vídeo
  • Gerador de Marcadores de Tempo de Vídeo
  • Resumidor de Vídeos
  • Contador de Palavras de Vídeo
  • Analisador de Títulos de Vídeo
  • Busca de Transcrições de Vídeo
  • Análises de Vídeo
  • Criador de Capítulos de Vídeo
  • Gerador de Quiz de Vídeo
  • Chat com Vídeo

Produto

  • Preços
  • Blog
  • Obter Extensão do Chrome

Developers

  • Transcript API
  • API Documentation

Legal

  • Termos
  • Privacidade
  • Suporte
  • Mapa do Site

Direitos Autorais © 2026. Feito com ♥ por Scribe

— Se isso tornou sua vida mais fácil (ou pelo menos um pouco menos caótica), deixe-nos uma avaliação! Prometemos que vai alegrar nosso dia. 😊

Related Videos

Qwen 2.5 VL Computer Use: FULLY FREE AI Agent With UI CAN DO ANYTHING! (Beats OpenAI Operator)

Video thumbnail
26.04k1,784 Palavras8m readGrade 18
Compartilhar
Channel
WorldofAI
a few days back I had made a video on a new remarkable model by Quin which was the 2. 5 Max Quin model that outperforms and out pieces deep seeks version 3 gbt 4 Omni plus claw 3. 5 Sonic and various benchmarks now in that video I had promised to make a video on another model they had released alongside with the 2.
5 Max model which is their quen 2. 5 Vision model this is their new flagship vision model of the coin series and it's also a significant leap from its previous coin 2vl model it's a great model for computer use like open AI operator and what's Wild is that the Quin 2. 5 VL 72b will perform as well as the base model for the operator agent meaning that you can use this local VL open- Source model to automate any computer-based task just take a look at the performance evaluation sheet that showcas es the performance of this Vision model where it delivers competitive performance across various benchmarks it excels in document and diagram understanding while functioning as a visual agent without task specific fine-tuning it matches state-of-the-art models in tasks like math question answering as well as video comprehension against the gp4 Omni model CL 3.
5 Sonet as well as its predecessor which is the quen 2vl it also closely Rivals Gemini 2. 0 flash where it out competes it in most benchmarks but Falls a bit short in triple muu now overall this model is open sourced with both base and instruct models in three different sizes including the 3B the 7B as well as a 72 billion parameter model which are available through hugging face you can also access the Quin 2. 5 VL 72b instruct model through quen chat which is their chat bot and in this case I'm going ahead and asking it what are these attractions please give their names in English and you can see list out giza's pyramids you have the Great Wall of China uh you have Statue of Liberty as well as Terracotta Army within China before we get started I got a huge new update this is where I've launched a new newsletter this is something that's going to be sent out on a weekly basis and essentially going to be updating you on the latest AI advancements comparison of different large language models AI news as well as ranking different AI agents so definitely go ahead and subscribe to this cuz you don't want to miss out on free AI news now guys when I say that this model is truly remarkable in terms of its visual understanding it definitely is cuz it has the capability to recognize objects analyze texts charts layouts within images and so much more better than any other Vision model can it functions as a visual agent for both computers as well as for phone use and it can comprehend long videos by pinpointing key events localizes objects with precise bounding boxes and it can even generate structured outputs for scan documents this is something that will make it highly useful in different categories like finance and e-commerce or having it so that it could help you process and sparse different types of data so you can see why this model could be really useful for many of us so imagine if we are to take this model's Vision capabilities and combine it with something like browser use which is a web automation framework you're definitely going to get the best accurate automation performance cuz as we all know browser use is a new framework that has came out recently which is open source and it basically is an AI agent that provides a powerful way for you to automate any website based task now you can see on the web agent accuracy benchmark test it is something that even outpaces the open AI operator which is recording an 87 percentage in terms of web agent accuracy in terms of Performing task and with browser use in comparison which scores an 89 percentage now just imagine if you are to combine this framework with the vision capabilities of quen you're definitely going to get the best precise output of any automation on the web which is just insane now there's a couple of ways to get started you can either use quen's API from their Cloud API platform to access the vision model capabilities through browser use or you can go ahead and install this inference based API server that is going to be accessible through open AI compatible API endpoint and to get started there's a couple of prerequisites for having browser use installed you're going to need to make sure that you have python installed UV to set up your virtual environment you'll need pip which is to help you install the packages and eventually you will also need git to help you clone the repository which is something that most people should have but once you have those prerequisites fulfilled head over to the web UI GitHub repository click on this green button copy this link to clipboard and I want you guys to open up your command prompt simply go ahead and type in get clone and paste in the link this will clone the repository of web UI once that is done you can then scroll all the way down and then you can type in CD web-ui to get into the web UI directory you want to then start off by creating your virtual environment for this so that it is contained in that environment then you want to activate your virtual environment so go ahead and send in this command which will activate this environment that you have set up once you have sent it in you can then install the dependencies so go ahead and copy this command now after you have installed all the dependencies you want to make sure you go ahead and install playright this is super simple you just need to go ahead and and copy this into your command prompt and this will install all the requirements for playwright and once that is done you can go ahead and open this up by going over and running this python script to start up the web UI with browser use that is functional and once that is done you can then go ahead and click enter and it will then open up within your Local Host within a couple minutes and then you can go ahead and open it up within your web browser now first things first you want to head over to llm configuration and obviously if you went along and set it up with the open AI compatible uh server or endpoint you will then need to select your model which is going to be the one that you just installed with it and then you want to set your Local Host to to the base URL that is said within that GitHub repository and then you want to leave your API key blank and then what you can do is head over to the browser settings as well as run agent to go ahead and execute any command so let's start off and test this we're going to go ahead and run this agent so that it goes over to World of AI my YouTube channel and it's going to go ahead and Source me the most popular video so right now you can see that it opened up a new tab to execute this and within a couple seconds it's going to go over to YouTube and find World of AI for me so within the search tab it should go ahead and search up world of AI and it will then go over to Source the latest or the most popular video so let's see what it ends up doing so it looks like it has found my channel and now it is going to sort through different videos to find the most popular one so it looks like the next step is going to be clicking on the popular button to sort through different videos that have the most amount of views and it should show up with fabric and there we go was able to find the fabric video that is the most popular on my channel so it was able to perform this task pretty quickly actually and it was able to do it quite accurately now I'm going to go ahead and send over this simple task but essentially what I'm trying to Showcase is that this model is exceptional in terms of going ahead and executing webbased task as it can go ahead and easily search for things quite quickly I went along and I requested it to search up trending AI research papers and we can see right now it is already on Google and it's going ahead to search this up now if you compare this with many of the other different types of computer agents it would take them a lot longer to process a query like this so in this case it's already went along and pasted in the prompt within Google search tab of trending AI research papers and already went along and searched it up and you can see there's already a list of different trending AI repositories that I can go ahead and click on if you like this video and would love to support the channel you can consider donating to my Channel Through the super thanks option below or you can consider joining our private Discord where you can access multiple subscriptions to different AI tools for free on a monthly basis plus daily AI news and exclusive content plus a lot more but guys that's basically it for today's video on the Quin 2.
5 BL model it's a truly remarkable model that excels in visual understanding and it's something that will help you recognize objects even better than something like Gemini 2.
Vídeos relacionados
This AI Technology Will Replace Millions (Here's How to Prepare)
53:17
This AI Technology Will Replace Millions (...
Liam Ottley
557,214 views
Even US Shocked by EU’s Surprise Ukraine Deal!
18:16
Even US Shocked by EU’s Surprise Ukraine D...
The Military Show
736,586 views
Attorney General Pam Bondi reveals when the Epstein files will be released
6:52
Attorney General Pam Bondi reveals when th...
Fox News
861,327 views
Deepseek R1 AGENTS: EASILY Build FREE AI Agents That CAN DO ANYTHING!
10:57
Deepseek R1 AGENTS: EASILY Build FREE AI A...
WorldofAI
15,929 views
Generate UNLIMITED AI Videos for FREE | AI Automation Secret | Qwen 2.5
5:20
Generate UNLIMITED AI Videos for FREE | AI...
Null face
4,648 views
Linus Torvalds Puts An End To Rust For Linux Drama
23:58
Linus Torvalds Puts An End To Rust For Lin...
Brodie Robertson
43,480 views
OpenAI o3-mini is a BEAST
27:40
OpenAI o3-mini is a BEAST
AI Search
363,407 views
FREE WebUI Beats OpenAI Deep Research & Operator!🤖 (ANY LLM) Open Source Browser Use AI Agent
16:36
FREE WebUI Beats OpenAI Deep Research & Op...
Josh Pocock
12,538 views
Jon Stewart Reworks Trump & Elon’s Sweeping DOGE Budget Cuts | The Daily Show
21:52
Jon Stewart Reworks Trump & Elon’s Sweepin...
The Daily Show
5,459,360 views
The new Qwen Model blows my mind! Qwen 72B 2.5-VL tested
17:13
The new Qwen Model blows my mind! Qwen 72B...
GosuCoder
4,774 views
Gemini Code Assist: FULLY FREE NEW AI Coder by Google! Github Copilot & Cursor Alternative!
9:27
Gemini Code Assist: FULLY FREE NEW AI Code...
WorldofAI
6,297 views
How to Build AI Agents with Jotform (FREE & No Coding Required!)
16:03
How to Build AI Agents with Jotform (FREE ...
Kevin Stratvert
20,478 views
Deep Research.....but Open Source
13:29
Deep Research.....but Open Source
NetworkChuck
261,860 views
CIVIL WAR: The Trump MAGA meltdown has begun
7:48
CIVIL WAR: The Trump MAGA meltdown has begun
David Pakman Show
532,574 views
AI Agents Fundamentals In 21 Minutes
21:27
AI Agents Fundamentals In 21 Minutes
Tina Huang
314,002 views
Google Changes Coding Forever by Making Gemini Code Assist FREE!
14:26
Google Changes Coding Forever by Making Ge...
Gary Explains
46,576 views
Top Open-Source GitHub Projects: AI, Robotics, Translation, & More #126
22:11
Top Open-Source GitHub Projects: AI, Robot...
ManuAGI - AutoGPT Tutorials
6,308 views
How To Build a Startup Team of AI Agents (n8n, OpenAI, FeedHive)
24:47
How To Build a Startup Team of AI Agents (...
Simon Høiberg
236,507 views
Deepseek-R1 Computer Use: FULLY FREE AI Agent With UI CAN DO ANYTHING! (Beats OpenAI Operator)
11:11
Deepseek-R1 Computer Use: FULLY FREE AI Ag...
WorldofAI
70,219 views
I built an AI supercomputer with 5 Mac Studios
34:57
I built an AI supercomputer with 5 Mac Stu...
NetworkChuck
678,638 views