Run AI Agents Locally with Ollama! (Llama 3.2 Vision & Magentic One)

  • Published 23 Nov 2024

COMMENTS • 34

  • @userou-ig1ze
    @userou-ig1ze 2 days ago +1

    Why are you like the only one reporting this in a timely manner on YouTube? Who _are_ you, and why am I not subbed? No really, community, who is this guy?

    • @userou-ig1ze
      @userou-ig1ze 2 days ago

      I look at this channel and it implements about 90% of the scary thoughts I've had lately but never had time to implement, including bringing back people from the dead. Awesome!

    • @OminousIndustries
      @OminousIndustries 2 days ago +1

      hahah I really appreciate the kind words! I like to make videos on new stuff I find cool and am glad others are interested too haha

  • @lesmoe524
    @lesmoe524 12 days ago

    Very cool stuff, that magentic-one looks incredible.

    • @OminousIndustries
      @OminousIndustries 12 days ago +1

      Thank you! Yes, it is really fascinating to watch the agents interact and see how the orchestrator gets them to do things. This is something I am going to continue playing with, as I have wanted an autonomous browsing agent like this for a while now.

  • @mice3d
    @mice3d 12 days ago

    very cool! :) exciting to see how fast this is all moving!

  • @vinh-thuy7033
    @vinh-thuy7033 8 days ago

    Thanks a lot for your work!! Keep going!!

  • @DanishFarooqWani
    @DanishFarooqWani 11 days ago +5

    Please let me know if you are planning to publish the changes on GitHub. If already published, please share the link. Thanks!

    • @OminousIndustries
      @OminousIndustries 11 days ago

      @DanishFarooqWani Hello, the repo is here; I had also shared it in response to your other comment, just want to make sure that did appear haha: github.com/OminousIndustries/autogen-llama3.2

  • @SmartRichLandlord
    @SmartRichLandlord 11 days ago

    You have made a really great thing. The visual part is handled by Llama 3.2 Vision. Can the coder model be changed to another model with coding expertise?

    • @OminousIndustries
      @OminousIndustries 11 days ago +1

      Thanks very much! Yes, I believe it can be, and that is something I wanted to attempt myself: essentially have a different model that may be a little smarter handle certain tasks, leaving the vision model to handle web images. It would of course be best to just use a large multimodal model like the Vision 90B, but since that isn't realistic for a lot of people, I think delegating specific tasks like coding to a more domain-specific model would be a good idea. It would be a bit of work to implement, but based on what I saw of the codebase it would not be unrealistic at all.
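
      A minimal sketch of that idea, using the plain openai client against Ollama's OpenAI-compatible endpoint rather than the repo's own client classes; the model names and the two-client split are illustrative assumptions, not code from the fork:

          # Hypothetical sketch: one client, two roles, both served by local Ollama.
          # Wiring this into Magentic-One's agents would still require editing the
          # repo's client-creation code.
          from openai import OpenAI

          ollama_client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

          VISION_MODEL = "llama3.2-vision"  # keeps handling web screenshots
          CODER_MODEL = "qwen2.5-coder"     # assumption: any coding model pulled in Ollama

          def ask(model: str, prompt: str) -> str:
              # Route a plain-text request to whichever local model fits the task.
              response = ollama_client.chat.completions.create(
                  model=model,
                  messages=[{"role": "user", "content": prompt}],
              )
              return response.choices[0].message.content

          print(ask(CODER_MODEL, "Write a Python one-liner that reverses a string."))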

  • @themax2go
    @themax2go 10 days ago

    Very cool. Can we use local model routing so it uses a specific model for coding (DeepSeek Coder), for image vision (Llama 3.2 / SAM 1 / ...), for video vision (SAM 2 / ...), for writing (ideally also differentiating between types of writing: technical, creative, summarization, ...), for image generation (Flux dev / ...), and so on?

    • @OminousIndustries
      @OminousIndustries 9 days ago +1

      In theory, yes, this is possible, though it would require more in-depth modification of the way the repo functions.
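
      A toy sketch of such a router, assuming every model is served locally by Ollama; the task categories and model names echo the comment above and are assumptions, since Magentic-One has no built-in router:

          # Map a task category to a locally served model name; anything
          # unmapped falls back to a general-purpose model.
          TASK_MODELS = {
              "coding": "deepseek-coder-v2",
              "image_vision": "llama3.2-vision",
              "technical_writing": "llama3.1",
              "creative_writing": "llama3.1",
              "summarization": "llama3.1",
          }

          def route(task: str) -> str:
              return TASK_MODELS.get(task, "llama3.1")

          assert route("coding") == "deepseek-coder-v2"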

  • @DanishFarooqWani
    @DanishFarooqWani 12 days ago

    Following up on this amazing work. Can you please provide the link to the fork with this modification?

    • @OminousIndustries
      @OminousIndustries 11 days ago

      Thanks very much, yes the fork is here: github.com/OminousIndustries/autogen-llama3.2

  • @themax2go
    @themax2go 5 days ago

    I pulled the original repo and then updated the files from your repo. I then set the OpenAI API key to "ollama" (without setting it I get an error message saying it needs to be set), and now I'm getting this error:
        File "autogen/python/.venv/lib/python3.10/site-packages/openai/_base_client.py", line 1634, in _request
            raise self._make_status_error_from_response(err.response) from None
        openai.AuthenticationError: Error code: 401 - {'error': {'message': 'Incorrect API key provided: ollama. You can find your API key at ...
    Any ideas?

    • @OminousIndustries
      @OminousIndustries 5 days ago

      Based on that, it seems like the src/autogen_magentic_one/utils.py file may not have been updated correctly. I would honestly clone my whole repo and try that, to rule out an issue stemming from implementing the changes manually in the original repo.
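
      For context on why that 401 appears: if the client is not repointed, the openai library still calls api.openai.com, which rejects the placeholder key. A minimal sketch of a correctly repointed client, using Ollama's default local endpoint (an assumption; the fork's utils.py is the authoritative version):

          # Point the OpenAI client at the local Ollama server instead of
          # api.openai.com; Ollama ignores the key, but the client requires one.
          from openai import OpenAI

          client = OpenAI(
              base_url="http://localhost:11434/v1",
              api_key="ollama",
          )

          response = client.chat.completions.create(
              model="llama3.2-vision",
              messages=[{"role": "user", "content": "Say hello."}],
          )
          print(response.choices[0].message.content)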

  • @ewasteredux
    @ewasteredux 12 days ago +1

    I think the only question I would have is how you eventually got it to work. I mean, it's great you got it working, but I would love to get that same functionality working on my own hardware. Based on what I am seeing, you have a fairly beefy system, so I am not sure I would have enough GPU power to replicate what you did.

    • @OminousIndustries
      @OminousIndustries 12 days ago

      It is not as resource intensive as a lot of other models, which is nice. The only real stressor was the model in Ollama, which was using around 11-12 GB of VRAM. The rest of my system is just generic components: a 12th-gen i7, a cheap motherboard, and 32 GB of RAM. The blog post for the model says it requires 8 GB of VRAM, so perhaps there is a way to get it to work with that: ollama.com/blog/llama3.2-vision
      If that were possible, it would be quite feasible to run all of this on a relatively cheap (in local-LLM terms) system with a 12 GB 3060. As for replicating it on one's own system, assuming the hardware is there, I think installing it per the instructions in the Magentic-One README, but with my fork instead, would let anyone get it working without having to wrangle any code.
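
      As a quick sanity check that the vision model runs locally at all, the ollama Python client (pip install ollama) can call it directly; the image path below is a placeholder, and llama3.2-vision must already be pulled:

          # One-off local vision call, independent of Magentic-One.
          import ollama

          response = ollama.chat(
              model="llama3.2-vision",
              messages=[{
                  "role": "user",
                  "content": "What is in this image?",
                  "images": ["example.jpg"],  # placeholder path
              }],
          )
          print(response["message"]["content"])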

  • @KostekSytnyk
    @KostekSytnyk 11 days ago

    Hi! This new agentic framework looks really nice! I went through some obstacles with installation, but now it's running smoothly. I wanted to see whether it's capable of scraping dynamic content. I provided it a link and asked it to scroll down the page to extract all the submission card titles. It was doing well until it had to execute the code the "coder" agent had written with Selenium to scroll down, and the issue was missing packages: "selenium" and "webdriver". It identified the issue, and the coder even provided steps to install those dependencies; however, it didn't manage to install the packages and got stuck in a loop where it constantly asked me to execute the code, saw the missing-package error, asked me again, saw the error, again and again.

    • @KostekSytnyk
      @KostekSytnyk 11 days ago

      @OminousIndustries I really want it to do advanced scraping as it is a great use case!

    • @OminousIndustries
      @OminousIndustries 11 days ago

      Interesting and very valuable feedback, thanks for that. I find that it gets a bit confused when needing to install dependencies; I noticed the same thing. The coder would understand the missing-package error thrown by trying to run the code, but after that would not really "get" that it needed to install it. I believe this may be a limitation of the model, though I wonder if it could be bypassed by manually installing the necessary dependencies for the scraping you will be doing in the Docker container the agents run in.
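
      A hypothetical sketch of that workaround: pre-install the scraping dependencies in the execution container so the coder agent stops looping on the missing-package error. The container name is a placeholder, not taken from the repo:

          # Install Selenium and its driver manager inside the running container.
          import subprocess

          for package in ("selenium", "webdriver-manager"):
              subprocess.run(
                  ["docker", "exec", "magentic-one", "pip", "install", package],
                  check=True,
              )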

    • @OminousIndustries
      @OminousIndustries 11 days ago

      Yes, there are a lot of great use cases for this, especially web-based ones.

  • @vijayamurugan271
    @vijayamurugan271 6 days ago

    Very cool. It would be better if you could share the code, so that it would be useful for everyone.

    • @OminousIndustries
      @OminousIndustries 6 days ago +1

      Thank you. The code is public and shared, you can see it here: github.com/OminousIndustries/autogen-llama3.2

  • @tommyholmberg
    @tommyholmberg 11 days ago

    Do you have a github repo where I can copy your changes?

    • @OminousIndustries
      @OminousIndustries 10 days ago

      Yes, here is a link: github.com/OminousIndustries/autogen-llama3.2

  • @epochgames3049
    @epochgames3049 11 days ago

    Hey bud, you got a discord?

    • @OminousIndustries
      @OminousIndustries 11 days ago

      I do, though I don't use it too often, so it may take me a day or so to see new messages; my username is omns.ind

  • @user-jk9zr3sc5h
    @user-jk9zr3sc5h 12 days ago

    Yeah, Qwen2-VL and Pixtral are better than 3.2V. You really need to use a larger model.

    • @OminousIndustries
      @OminousIndustries 11 days ago

      Good points. I wanted to start with 3.2 Vision, but going forward I definitely want to use a larger model. Getting extra GPUs for the 90B is semi-tempting, but I do not want to rebuild my PC hahah