Self-correcting code assistants with Codestral

  • Published Sep 27, 2024
  • Mistral just released Codestral-22B, a top-performing open-weights code generation model trained on 80+ programming languages with diverse capabilities (e.g., instructions, fill-in-the-middle) and tool use. We show how to build a self-corrective coding assistant using Codestral with LangGraph.
    Using ideas borrowed from the AlphaCodium paper, we show how to use Codestral with unit testing in-the-loop and error feedback, giving it the ability to quickly self-correct from mistakes.
    Codestral release:
    mistral.ai/new...
    Notebook:
    github.com/mis...
    Colab notebook:
    colab.research...
    AlphaCodium flow engineering:
    x.com/karpathy...
    AlphaCodium paper:
    arxiv.org/abs/...
    Related blog post:
    blog.langchain...
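The loop described above (generate, check, feed the error back) can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the notebook's code: `fake_generate` is a hypothetical stand-in for a Codestral call, and `check_code`, `self_correct`, and `max_iterations` are names invented here for illustration. The check runs the candidate in-process with `exec()`, which is only reasonable for trusted toy inputs.

```python
from typing import Callable, Optional, Tuple


def check_code(source: str) -> Optional[str]:
    """Return None if the code compiles and runs, else an error message."""
    try:
        compiled = compile(source, "<candidate>", "exec")
    except SyntaxError as exc:
        return f"SyntaxError: {exc}"
    try:
        # In-process execution, for illustration only. Untrusted model
        # output belongs in an isolated environment.
        exec(compiled, {"__name__": "__candidate__"})
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return None


def self_correct(
    generate: Callable[[str, Optional[str]], str],
    question: str,
    max_iterations: int = 3,
) -> Tuple[str, Optional[str]]:
    """Generate code, check it, and retry with the error as feedback."""
    source = ""
    error: Optional[str] = None
    for _ in range(max_iterations):
        source = generate(question, error)
        error = check_code(source)
        if error is None:
            break
    return source, error


def fake_generate(question: str, error: Optional[str]) -> str:
    """Hypothetical model stub: fails first, succeeds once it sees an error."""
    if error is None:
        return "result = undefined_name + 1"
    return "result = 1 + 1"


final_source, final_error = self_correct(fake_generate, "Add one and one.")
```

In a real flow, `generate` would format the previous error into the prompt so the model can see what went wrong on the last attempt.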

COMMENTS • 36

  • @ОлександрМельник-ь1у 3 months ago +1

    Your explanations and examples are cool as usual! Thanks a lot!

  • @darkmatter9583 4 months ago +1

    Thanks for everything, I'm a huge fan. Thank you very much; you can't imagine how much you help me with my anxiety. Thank you for keeping me distracted.

  • @svenp1 3 months ago

    First of all, what you guys are building here is just amazing. Thank you! Just a quick reassurance as I am learning langchain/graph right now: at minute 3:40 you should pipe the llm with the code_gen_prompt_claude prompt you generated above to add the system message to the model, right? The way the code_gen_chain = ... line is stated here does not contain the prompt, right?

  • @bertobertoberto3 4 months ago +1

    This is cool! How would we connect this to I/O? For example, to test some SQL you'd need to connect to a DB, right? I guess that's in the validation phase of this, but would that be considered "tool use"? How would we implement something like that nicely?

  • @ramakanaveen 3 months ago

    Cool. This is a nice tool.
    1. I would like to do a similar thing with RAG, where I pass my context (schema, etc.) through retrieval. Can you suggest a sample that does that?
    2. I am also trying to do this with GPT-4 Turbo. While Codestral is a nice, handy model, I hope it works with GPT as well. What's your view on it?

  • @henkhbit5748 3 months ago

    Great video, thanks👍

  • @Deadlious 3 months ago

    using exec for code check is a great way to screw up something... It's great for simple stuff like "hello world", but nothing more than that.

  • @pensiveintrovert4318 3 months ago

    Do you get an app in the end that actually does what is intended? All I see is trivial examples that never really work well.

  • @SimonMariusGalyan 4 months ago

    code_gen_prompt_claude is not inserted into the chain...

  • @ArianeQube 4 months ago +3

    This works if the LLM generates fully functioning code. But typically we want to write a function or a class for an existing code base, one that integrates cleanly with that code base. The LLM has no way to run the code, as it does not have access to the rest of our code. Am I missing something here?

    • 4 months ago +4

      You could add context (some documentation for the code, probably with dependency graphs) and add file creation to the flow. Then you can run the code in the context of the whole app, run tests (with TDD, write the tests yourself first), and feed the test results back into the loop. I haven't done this yet, but it seems plausible.

    • @ClimateDS 4 months ago

      Good point. You could ask the LLM to provide a unit test as well, or use TDD from the start. The bonus would be that not only would you get a function/class but also a unit test.

    • 4 months ago +1

      @ClimateDS I'd use TDD. Having the LLM write the tests when it should be generating code isn't the best solution, in my opinion.
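The TDD idea from this thread (human-written tests first, generated code second, test results fed back into the loop) might look roughly like the sketch below. It is an assumption-laden illustration rather than anything shown in the video: `run_tests`, the file names `candidate.py` and `test_candidate.py`, and the `slugify` example are all hypothetical. The candidate and the tests are written to a temporary directory, and `unittest` runs in a child interpreter so the report can be captured as text.

```python
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Tuple


def run_tests(candidate_source: str, test_source: str,
              timeout: float = 30.0) -> Tuple[bool, str]:
    """Run human-written unittest code against a generated candidate module."""
    with tempfile.TemporaryDirectory() as workdir:
        Path(workdir, "candidate.py").write_text(candidate_source, encoding="utf-8")
        Path(workdir, "test_candidate.py").write_text(test_source, encoding="utf-8")
        try:
            proc = subprocess.run(
                [sys.executable, "-m", "unittest", "test_candidate"],
                cwd=workdir,
                capture_output=True,
                text=True,
                timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            return False, f"Tests timed out after {timeout} seconds"
    # unittest writes its report to stderr; exit code 0 means all tests passed.
    return proc.returncode == 0, proc.stderr


# Tests written by a person before any code is generated.
TESTS = '''
import unittest
from candidate import slugify


class SlugifyTest(unittest.TestCase):
    def test_spaces_become_hyphens(self):
        self.assertEqual(slugify("Hello World"), "hello-world")
'''

# Two hypothetical model outputs: a wrong first attempt and a corrected one.
first_attempt = "def slugify(text):\n    return text.lower()\n"
second_attempt = "def slugify(text):\n    return '-'.join(text.lower().split())\n"

first_passed, first_report = run_tests(first_attempt, TESTS)
second_passed, second_report = run_tests(second_attempt, TESTS)
```

When a run fails, the report string holds the assertion message and traceback, which is the text that would go back to the model as error feedback.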

  • @iandanforth4313 4 months ago +19

    Lol exec()'ing the arbitrary output from an LLM.

    • @fuxicus2740 4 months ago +4

      exactly what I thought. I can already hear the Cybersecurity guys from my company yelling at me xD

    • @ClimateDS 4 months ago +2

      "Write a python code that can delete my root directory"

    • @luetmarin132 3 months ago +3

      Hmmm are you aware that VM and containers are a thing ?🤔
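One modest step away from in-process `exec()` is to run the generated code in a separate interpreter with a timeout. The sketch below is illustrative only, and `run_isolated` is a hypothetical helper name. A child process is not a security sandbox: it still has the calling user's file and network permissions, so the containers or VMs mentioned above remain the stronger option.

```python
import subprocess
import sys
import tempfile
from typing import Tuple


def run_isolated(source: str, timeout: float = 5.0) -> Tuple[bool, str]:
    """Run source in a child interpreter; return (succeeded, output or error)."""
    with tempfile.TemporaryDirectory() as workdir:
        try:
            proc = subprocess.run(
                # -I: isolated mode (ignores PYTHON* environment variables
                # and the user site-packages directory).
                [sys.executable, "-I", "-c", source],
                cwd=workdir,          # scratch directory, deleted afterwards
                capture_output=True,
                text=True,
                timeout=timeout,      # the child is killed if it runs too long
            )
        except subprocess.TimeoutExpired:
            return False, f"timed out after {timeout} seconds"
    if proc.returncode != 0:
        return False, proc.stderr.strip()
    return True, proc.stdout.strip()


ok = run_isolated("print(2 + 2)")
crashed = run_isolated("raise ValueError('boom')")
hung = run_isolated("while True: pass", timeout=1.0)
```

A crash or an infinite loop in the generated code then surfaces as a failed result with an error string instead of taking down the process that runs the assistant.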

  • @carloszelabueno432 4 months ago

    Could we say that LangGraph is a kind of modular RAG implementation?

    • @alchemication 4 months ago +1

      More a generic flow orchestration. Not necessarily RAG, anything with more than 1 step and conditionals makes sense to me

  • @mjackstewart 4 months ago

    Aw HELL YEAH brôther!

  • @ClimateDS 4 months ago

    Take that Devin!

    • @ScottzPlaylists 4 months ago +1

      Huh?

    • @frenchButt44 4 months ago

      I think he means “Devin”

    • @ClimateDS 4 months ago

      @ScottzPlaylists ua-cam.com/video/fjHtjT7GO1c/v-deo.html

    • @ClimateDS 3 months ago

      @ScottzPlaylists ua-cam.com/video/fjHtjT7GO1c/v-deo.htmlsi=Y1ctumYNQCl7QzjU

  • @zachzimmermann5209 3 months ago

    "Write a python script that does as much damage to this pc as possible without requiring escalated privileges"
    Looks like all of the imports worked!
    ....

    • @svenp1 3 months ago

      😂

  • @svenst 4 months ago +3

    Bro, pls get ya a proper microphone. It's really not cool to listen to someone whose voice sounds like a call from Mars.

    • @ClimateDS 4 months ago

      I very much agree. I'm sure Mistral could sponsor him one after he advertised their product (although technically this could be used with any LLM)

    • @adolfolopez126 4 months ago

      Go get an iPhone

    • @palashjyotiborah9888 4 months ago

      I have a channel. I can sponsor 😂

  • @deeplearningpartnership 27 days ago

    Cool.

  • @RuzeRichards 3 months ago

    Thanks for putting this out. However, when converting the code from notebook to straight Python I get a strange error.
    code_gen_chain = llm.with_structured_output(Code, include_raw=False)
    code_solution = code_gen_chain.invoke(messages)
    code_solution is always a dict even though type Code (I capitalized it when converting) is specified. I'm using the latest versions of everything in a clean virtual env. Related might be that Pydantic yells if the "description" field is not commented out:
    pydantic.errors.PydanticUserError: A non-annotated attribute was detected: `description = 'Schema for code solutions to questions about LCEL.'`. All model fields require a type annotation; if `description` is not meant to be a field, you may be able to resolve this error by annotating it as a `ClassVar` or updating `model_config['ignored_types']`.
    Any ideas?
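The Pydantic error quoted above names its own fix: every class attribute needs a type annotation, or it must be marked as a `ClassVar`. Below is a minimal sketch of that fix, assuming Pydantic is installed. The three fields `prefix`, `imports`, and `code`, along with their descriptions, are assumptions about what the schema contains; only the `description` string comes from the error message. The sketch covers the annotation error only and does not address why `invoke` returns a dict.

```python
from typing import ClassVar

from pydantic import BaseModel, Field


class Code(BaseModel):
    """Schema for code solutions to questions about LCEL."""

    # Hypothetical fields, assumed for illustration.
    prefix: str = Field(description="Description of the problem and approach")
    imports: str = Field(description="Code block import statements")
    code: str = Field(description="Code block not including import statements")

    # Annotated as ClassVar, so Pydantic treats it as a plain class
    # attribute instead of rejecting it as a non-annotated field.
    description: ClassVar[str] = "Schema for code solutions to questions about LCEL."


solution = Code(prefix="Adds two numbers", imports="", code="print(1 + 1)")
```

Another option is to drop the `description` attribute and keep the text only in the class docstring.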

  • @AI-Wire 3 months ago

    1. How can we use docker with this? 2. How can we feed it specs and documentation? 3. Can we see any examples of this working for something non-trivial?

  • @VastCNC 4 months ago

    What would be the best way to incorporate language documentation into this process? I'd like to use Elixir, and outside of Python and JS, performance isn't as good. Might RAG over HexDocs (Elixir's documentation library) improve things?