I used the first AI Software Engineer for a week. This is happening.

Underfitted

Published May 2, 2024
I teach a live, interactive program that'll help you build production-ready Machine Learning systems from the ground up. Check it out here:
www.ml.school
To keep up with my content:
• Twitter/X: / svpino
• LinkedIn: / svpino
Science & Technology

COMMENTS • 93

@tonywhite4476 3 months ago ⁺¹⁸
It’s not going to replace all software engineers but they won’t need as many.
@paulocacella 3 months ago
That is the correct point.
@moozooh 3 months ago ⁺³
The fact that it's disproportionately disrupting the entry-level jobs first is much more dangerous than simply removing the need for a percentage of jobs per se because it creates a barrier for entry in the profession that will affect every future generation (the further in the future, the more it will), to the point where it's just not financially viable for newcomers to keep investing in it (because you won't get your money back until after you've reached senior level).
@manuelmaxgonzalez2432 2 months ago
I think it is still a little too early to tell. It depends on how fast these things get better. A drastic improvement in productivity per SWE might enable a lot of projects that were too expensive before and end up increasing demand. But if these tools improve very fast, then supply will flood demand.
@pabloarroyo7952 3 months ago ⁺²
Very good video. One to look back on in a couple of years' time.
@scretney1 2 months ago
Thanks, Santiago - excellent review of Devin. Appreciate you.
@javaparainiciantes 2 months ago ⁺²
02:45 - This is Devin - 1st test - MNIST digit classification
04:59 - Devin asks for help
05:48 - Deploy on Heroku
06:28 - Devin said it deployed but it didn't
07:34 - Completed the exercise but left a lot of dead code
09:15 - Second project: tic-tac-toe
10:21 - Ask Devin to move the button below the board
11:55 - Devin deployed on Netlify
12:05 - The third project: Lunar Lander Project
13:21 - Devin figured out that he had to migrate the TF version
15:30 - Impressive but disappointing. Devin broke the code
16:10 - Python Backend Implementation - take-home assessment
17:54 - Improve the UI
18:21 - Final Example - RAG Example - Almost worked but he had closed the session
21:10 - Second try - complete failure
23:00 - Devin feels very slow
23:40 - Opinion: Biggest value of Devin
24:10 - Conclusion
@demianclarke 3 months ago ⁺²
Thanks, Santiago, for such valuable content. A hug from Barcelona!
@underfitted 3 months ago ⁺¹
Thanks!
@inteligenciamilgrau 3 months ago ⁺⁵
The new kind of programmers are the ones who program AIs to program better for us!
@ShpanMan 3 months ago ⁺²
Yeah, for 1-2 years. Then AI could do that too.
@24-7gpts 3 months ago
Awesome video!
@rsivakanth 3 months ago
Good one, Santiago, you put Devin to the test, for sure 🙂 Still, this is reassuring: SW developers/engineers aren't under threat, yet ;-) Thanks.
@germainrodrigue367 3 months ago ⁺¹
Santiago, you're amazing 🎉
@pensiveintrovert4318 3 months ago ⁺³
Summary: it is not usable for what it was claimed. I have now spent 4 days playing with gpt-pilot with Llama 3 70b. It goes around and around, making mistakes, trying to correct mistakes, then doing this infinitely.
@lokeshsharma4177 2 months ago ⁺¹
I second every single word you said. I have a Computer Engineering background with 28 years in the industry (although the tech part was only the first few years) and have seen the transformation through the on-prem, self-service, and cloud journeys. As you rightly say, CHIEF, this is nothing but a marketing stunt at this time, and an ambition for where we (humans) want to be in the future. God bless you.
@divyapadhiyar9470 3 months ago
What a proper explanation. It helps us learn about AI.
@francescociulla 3 months ago
Thanks for sharing, Santiago!
@doshin2019 2 months ago
Thanks for your review! I have a question regarding the LLM model used. While I understand these models are typically trained on open-source data (please correct me if I'm mistaken), I'm curious about the potential future implications.
What if, down the line, LLMs are trained on massive amounts of proprietary code? What kind of outcomes might we expect from such a shift? I'm interested in your thoughts on this.
@abudhabi9850 3 months ago ⁺⁵
So it kinda can solve somewhat easy problems, while the solutions it creates are likely hard to maintain and change. Nice for solutions which just have to work somewhat; however, when you require certainty you wouldn't want it to write your code.
Maybe Devin would really benefit from a "project cleanup" command before it delivers a project?
@davidcrocombe1322 3 months ago
I think these AI should always do a cleanup automatically, however if they don’t then we need to ask for it as standard procedure.
Come to think of it, we probably need to be specific about what cleanup we need - remove dead code, runtime performance, human readable code style & comments, dependencies allowed.
@carinebruyndoncx5331 3 months ago ⁺¹
As soon as you have a working program with automated tests, you can start refactoring and improving. Look at the focus area of Codium.
@prasadghumare 3 months ago
Amazing!
@charith493-4 23 days ago
Thanks a lot for this awesome content ❤ It would be super helpful if you could make a video for people starting their IT careers in 2024. Maybe cover what areas they should focus on. Thanks again!
@underfitted 23 days ago
I recorded a video on how to start. A roadmap. Check my past content.
@villanianalytics 3 months ago ⁺²⁷
This just goes to show that while AI can complete many tasks, right now there is a huge dependency on the user being knowledgeable about what is being requested. You got as far as you did because you were able to help point the AI in the right direction. Someone with no coding background wouldn't even be able to get a fraction of the progress you were able to get
@goatpepperherbaltea7895 3 months ago ⁺⁶
Yeah but rn computers take up a large room but one day they’ll fit in your pocket and be thousands of times faster
@RavishankarAyyakkannu 3 months ago ⁺¹
The same applies for generative music or image generation. You should be more proficient as an artist or musician to get what you want instead of some random cute generation.
@zedmor 3 months ago ⁺⁵
First customers of systems like this would be developers.
@Yomi4D 3 months ago
That's rn. This will change.
@malartbecomes236 3 months ago ⁺²
You'd be surprised what beginner coders can get out of models with enough specificity, especially if you provide it with the right context. The issue is that the models aren't adept at finding, or more importantly, recognizing the correct, up-to-date and actionable information via search and RAG; without very specific instructions, they lack the sufficiently complex, robust memory and reasoning skills that humans do. I don't think we are ever going to get to the point where a human can provide a non-specific prompt and have the model intuit exactly what the human left out, unless we do something ludicrous like training models to be lifelong companions and pairing models and humans at birth. The whole approach is wrong.
We should be encouraging hallucinations and handling them differently. Not sure exactly how, but I know the FLARE framework tries to assess when a model is unsure about a token and uses that as an opportunity to perform RAG generation, but I think a much more effective method would probably be to allow the model to follow the alternative thought path (tangent), with some sort of way to summarize and classify the contents of the tangent, have another model attempt to verify the information, return the model to before the state where the tangent started, inject the information (along with the verification attempt) into some sort of internal thought register, so the model can 'register' the thought without compromising the current output, and then reassess the model's confidence in the next token. I know variations of this are already implemented elsewhere, but I don't think anyone is doing exactly this. It would be sort of similar to the tree of thoughts, but probably more robust, because it would bring up all sorts of other considerations to keep in mind, based on the problems the model ran into on each tangent.
This would obviously get very expensive so it's probably a crappy idea, but I like thinking of stuff like this.
I'm a beginner coder, if it wasn't obvious.
@AndrejsKarpovs 3 months ago ⁺¹
Would definitely use Devin in its current form to boost my learning!
@jofus521 3 months ago
Do you think for the lunar lander, it would be useful to have it write tests first, then refactor the code afterwards? Would it be capable of writing the tests based on its understanding of the code without running it?
@underfitted 3 months ago
I’m not sure. For the lunar lander, it’s a neural network that powers everything, so it would be very hard to test it with unit tests. More generally, tests can definitely help a tool like Devin.
@ndrcntrl 3 months ago ⁺²
Excellent, thanks for the detailed preview of Devin! It’s definitely the real deal. Now I can begin to understand the incredible valuation of such an early stage company. So many tasks from my current dev backlog could be assigned to multiple instances of Devin running in parallel. I can dream of being freed up from many of those mundane dev tasks to pursue the fun and interesting aspects of projects with the help of an AI assistant like Devin. Super excited to get access, hopefully in the not too distant future. Great video, love your content 🤩
@carinebruyndoncx5331 3 months ago ⁺¹
I feel the same way. I think I am going to invest in a multi-session setup to multitask with Devin, Devika, ... The future software engineer's desk will look more like a control room, I think.
@tomas0413 3 months ago
Hey, Santiago, great video! I’m still on a waiting list for Devin, but I looked at OpenDevin a few weeks ago. It was perhaps a bit too early and I plan to have a look at OpenDevin again. Any thoughts / plans on making a Devin vs OpenDevin comparison?
@underfitted 3 months ago ⁺²
That’s a good idea!
@henrymaddocks984 3 months ago ⁺¹
This is a great video. "Some weird things inside" is not OK though. This is why we have senior developers.
@moozooh 3 months ago ⁺¹
This is the issue, though. Senior developers didn't start off senior; they were students, then possibly interns, juniors, middles, seniors. If AI disrupts this chain of skill cultivation by removing any need for internships and like 90% of juniors and some middles, how are they going to become seniors in the future? In fact, how would a future software engineer even enter the market and prove their competitive advantage?
@FergusMeiklejohn 3 months ago
What did it cost? I remember swyx said that Devin is expensive.. I wonder what the cost/performance would be if it used Llama3 70b through Groq
@underfitted 3 months ago
I got free access to it.
@felixronnoh 3 months ago
Nice review. Are you the first person to create the lunar lander?
@wwkk4964 3 months ago ⁺¹
Looks like Devin wrote India's 2019 lunar lander code too, it crashed!
@brucerosner3547 3 months ago ⁺⁶
I think this misses the whole point. Coding is a mechanical process readily automated. Software engineering comprises first generating requirements, that is, defining what is to be done, and then selecting the most appropriate solution to meet the requirements. Defining requirements requires knowledge of the problem space, not just computer knowledge.
@raymond_luxury_yacht 3 months ago ⁺¹
Yup. It's concept vs. production. Production is just factory. And only robots work in factories now. Yup. High-level conceptual work is the high-value work. Which means you need an imagination, just like Einstein said.
@hansu7474 3 months ago
Add to that, coding is not a mechanical process. It's a ridiculous statement. I think if you're a software engineer you'd know it's not.
@bjrc 3 months ago
This is exactly what I've concluded over the past few months. But it applies to many domains, not just coding. Retail for example: existing LLMs can give a lot of high level information about how to optimise a retail organisation, but without being spoon-fed very carefully constructed reports and tools, it won't get anywhere. I hope it will get better with future LLMs, but for now they need a lot of guidance.
@patrickwhite9902 3 months ago
Soz if I missed it, but what LLM is behind the demo? I think the Devin mechanism is good but its capability is model bound, right?
@underfitted 3 months ago
I’m not sure what LLM they use. I don’t know if they disclose that.
@24-7gpts 3 months ago ⁺¹
It's GPT-4 Turbo, the 2024-04-09 version
@BhargavSolankisolankibhargav 3 months ago
Do you always write such remarkably grammatically correct prompts?
@underfitted 3 months ago ⁺¹
Only when I’m drunk
@tsaminamina_eheh 3 months ago
Do they use their own LLM or an existing one under the hood?
@riderjohnny5117 3 months ago
They use GPT-4
@underfitted 3 months ago
Personally, I don’t know.
@raymond_luxury_yacht 3 months ago
It's all about the fine tune. I expect ppl are working on really getting specific models expert in specific languages to write apps for particular contexts.
@tarekabiramia913 3 months ago
How much time did they take to give you access?
@underfitted 3 months ago
I reached out to them directly on social media. They probably gave me access because I have a large audience.
@tarekabiramia913 3 months ago
@underfitted I highly appreciate your quick reply, so I need to wait in the queue 😅
@avi7278 3 months ago
Can you ask Devin to integrate Branch deep linking into a cross-platform Flutter application for iOS, Android and macOS? Their documentation is notoriously sh** and I want to see how it handles it. I must admit that your examples are closer to real-world tasks than those of most people out here trying to hype this thing, which is something that has always bothered me. The people trying it seem to have little to no real professional development experience. I'm not looking for a junior dev that I have to babysit.
@middle-agedmacdonald2965 3 months ago
Thanks, first video I've seen. I don't share your optimism about the future. The idea is to eliminate paying for labor, or to get it as cheaply as possible.
We're all guilty of wanting things cheap, so it's all of our faults.
@greg-guy 3 months ago
Can you share how much you paid for tokens on each of the projects Devin was working on?
@underfitted 3 months ago
I got free access to Devin.
@goldmanguyok66292 3 months ago
add agent to remove unused code
agent for judging technology, which will be faster and easier
all your comments are easily solvable
@henrymaddocks984 3 months ago ⁺¹
After everything you saw I don't get why you think the quality of software will improve using these tools.
@underfitted 3 months ago ⁺¹
Because today is Day 1. How much do you think this will change in 5 years?
@raymond_luxury_yacht 3 months ago
What quality software? All the SW I use is crap. Bugs, design issues, poor UX, worse UI. It can't be any worse than the nonsense we already have.
@goldmanguyok66292 3 months ago
@underfitted you can make an agent to remove unused code, an agent to judge technology... All your comments in the video are easily fixable. In 1 year or less it will already be perfected.
@henrymaddocks984 3 months ago
@raymond_luxury_yacht then make better choices.
@davidcrocombe1322 3 months ago
It changed your request of recognising the numbers 0 to 10 into 0 to 9.
@underfitted 3 months ago
Yup
@surajm.s8561 2 months ago
That's a lot of tokens
@T___Brown 3 months ago
Now you know how frustrating it is to be a BA. lol maybe Devin should fix the BA first.
@ShpanMan 3 months ago ⁺¹
Haha, you are in the right direction but you don't appreciate how much smarter than humans AI will be in the coming years.
There will be no need for a human anywhere in the flow (well except for setting the goal). Give me an example of something a human would be needed for and recognize that future AI will do that faster, better, and cheaper.
Devin is just the beginning, it's cool, but you did recognize that improvements are a simple action of replacing the brain behind it with the smarter model - that's it.
The singularity is near.
@JD_2020 1 month ago
Is this a paid promo?
@underfitted 1 month ago
No, it is not a paid (or unpaid) promo.
@dfsadsaaad 3 months ago
Well done. However, these tools will only improve over time, and eventually, humans will not need to write the code; they will only need to test it, assess it, and determine its usefulness. I have been writing code for more than 30 years and in more industries and more applications. Easy software jobs will disappear and only real "engineers" will remain. Self-taught techies or graduates from code academies should think about the trades. MORE software will not be needed in the future. AI's will do all of this on the fly on demand within 5 years. I have built a Devin-like system with CrewAI and it works better than Devin. Wait until GPT5.
@raymond_luxury_yacht 3 months ago
The web is dead, thank god. In the future, publishing content will mean uploading data to an embedding model which is borged into an LLM for RAG. The output will be generated on the fly to suit the question, e.g. speech, generated video, text, music, etc. No more web interfaces. Web designers better start retraining.
@robertosolari__ 3 months ago ⁺²
So guys, learn maths, learn code, learn AI...
@ShpanMan 3 months ago ⁺³
More like plumbing...
@robertosolari__ 3 months ago
@ShpanMan yeah, also. I was thinking about agriculture
@chimwemwechinamale6716 3 months ago
You really don't need to learn AI. No need for that; only a handful of individuals, mostly in research and at big corps, matter.
@EduardsRuzga 3 months ago
AI will do math. AI will do code. AI will even do parts of entrepreneurship. AI already does AI :D Aka generates evals, synthetic data sets, picks model to fine tune, does that, runs evals, picks winners :D
But like with Devin, the question is speed and price. Some tasks, like large software engineering efforts, have to deal with a lot of uncertainties: UX, GDPR, and a lot of random variables from hardware to infra, to OS, to software, frameworks and dependencies in the project, plus 3rd-party modules. There is a lot of work to go through even for a team of expert humans.
I do think we will get there in the next 5 years. It feels like we are moving hard from imperative to declarative, and not just in coding. It's about figuring out what to do, not how.
The question then is: what is not a commodity, where are the costs, what is valuable?
Chips, compute, energy? Intelligence will be exchangeable for those. Kinda like now you can spend money to get back time by making other humans do things.
There are things AI will not change, though. Like, it can't do anything about land. Land is a finite resource and we care more about some places than others, and AI will not change that drastically. Aka AI will not change the laws of physics.
Weird times. I wonder if we can get to net 0 where tech can allow to get basic necessities close to 0 like food/shelter/health/education.
