The Rise of the Evil Notebook Engineer | Databricks Notebooks | Data Engineering | Software

  • Published 25 Jan 2025

COMMENTS

  • @mattmartin5136
    @mattmartin5136 5 months ago

    Pure gold here 😄😄

  • @UnemployMan396-xd7ov
    @UnemployMan396-xd7ov 5 months ago +1

    That blog got you a subscriber

  • @apollon456
    @apollon456 5 months ago

    What I’m struggling to understand is how it’s even possible to make notebooks work in production, such that there is an entire faction of people shilling for them. How are their companies even functioning when, in 9.999999 out of 10 cases, the notebook breaks because the cells were run out of order or something?

    • @NostraDavid2
      @NostraDavid2 5 months ago

      You can run notebooks (with the cells executed in order) in Databricks as if they were production code. I imagine they make nice prototypes, but code that is only tested manually is a long-term no-no.

    • @davidleite7080
      @davidleite7080 5 months ago

      Databricks sort of encourages you to use notebooks all the way. Orchestrating them through workflows solves the cell execution order issue.
      But maintaining them is insane as the Workspace scales up.

    • @matthiaswarlop2316
      @matthiaswarlop2316 5 months ago

      @apollon456 How do you run Python code in your company?

  • @matthiaswarlop2316
    @matthiaswarlop2316 5 months ago

    I'm extremely inexperienced and have only used notebooks in the cloud (so for programs that need to run continuously). Is the alternative to notebooks just Docker containers? How does a company generally run Python code in production? I work at a startup and I'm the only technical guy.

    • @Eriddoch
      @Eriddoch 5 months ago

      "Is the alternative to notebooks just docker containers?"
      Oh man. No. Sorry, it's not you. It's just that this is a common question, and it shows how far removed so many analyst types are from software engineering practice.
      The alternative to notebooks is small, modular .py files organized into something called a "package".
      Docker helps both notebooks AND packages be reproducible (if you leave and come back to your code one month from now, there's still a good chance you can install and run everything without fighting "dependency hell").
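
      A minimal sketch of what "small, modular .py files in a package" can look like. Everything here is hypothetical (the package name `my_pipeline`, the file names, the function `dedupe_latest`); the point is only that logic lifted out of a notebook cell becomes a plain function that an automated test can import and check.

```python
# Hypothetical package layout (names are illustrative, not from the thread):
#   my_pipeline/
#       __init__.py
#       transform.py          <- dedupe_latest lives here
#   tests/
#       test_transform.py     <- test_dedupe_latest lives here

from typing import Dict, Iterable, List


def dedupe_latest(
    rows: Iterable[Dict], key: str = "id", ts: str = "updated_at"
) -> List[Dict]:
    """Keep only the most recent row per key, sorted by key."""
    latest: Dict = {}
    for row in rows:
        current = latest.get(row[key])
        # Replace the stored row only if this one is strictly newer.
        if current is None or row[ts] > current[ts]:
            latest[row[key]] = row
    return sorted(latest.values(), key=lambda r: r[key])


def test_dedupe_latest() -> None:
    """A test that pytest (or a plain call) can run without any notebook."""
    rows = [
        {"id": 1, "updated_at": "2025-01-01", "status": "new"},
        {"id": 1, "updated_at": "2025-01-03", "status": "paid"},
        {"id": 2, "updated_at": "2025-01-02", "status": "new"},
    ]
    result = dedupe_latest(rows)
    assert [r["status"] for r in result] == ["paid", "new"]
```

      Because the function takes its input as an argument instead of reading notebook state, there is no cell order to get wrong.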

    • @matthiaswarlop2316
      @matthiaswarlop2316 5 months ago

      @Eriddoch Yes, obviously my Python project makes use of packages. I'm more wondering how to run these continuously in the cloud. Do I let Microsoft host the Docker container for me, or are there other ways of running my Python project in the cloud?

    • @matthiaswarlop2316
      @matthiaswarlop2316 4 months ago

      @Eriddoch Could you have a look at my previous reply?
      I'm still struggling to find out how you would run a Python project in the cloud without Docker containers.

    • @badjouras
      @badjouras 2 months ago

      @matthiaswarlop2316 You can still use Databricks to run Python packages as workflows, for example. Instead of the workflows pointing to notebooks in a Git repo, they can point to a package.
      An alternative, if you're not using Databricks, is indeed to run containers, for example in Kubernetes clusters.
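
      A rough sketch of what a workflow or a container would "point to" in a package: one entry-point function that takes its parameters as command-line arguments. The names (`my-pipeline`, `--run-date`, `--dry-run`, `run`) are assumptions made up for illustration, and the body of `run` is a placeholder for real pipeline logic.

```python
import argparse
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse job parameters; argv=None falls back to sys.argv[1:]."""
    parser = argparse.ArgumentParser(prog="my-pipeline")  # hypothetical name
    parser.add_argument("--run-date", required=True, help="Logical date of the run")
    parser.add_argument("--dry-run", action="store_true", help="Skip writes")
    return parser.parse_args(argv)


def run(run_date: str, dry_run: bool) -> str:
    """Placeholder for the real work; returns a summary line."""
    mode = "dry run" if dry_run else "live run"
    return f"{mode} for {run_date}"


def main(argv: Optional[List[str]] = None) -> int:
    """Entry point a scheduler or container command would call."""
    args = parse_args(argv)
    print(run(args.run_date, args.dry_run))
    return 0  # exit code 0 signals success to the scheduler
```

      The same `main` can be registered as a console script in the package metadata, so a scheduled job and a container command can both invoke it with the same arguments, for example `main(["--run-date", "2025-01-25", "--dry-run"])`.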

  • @tonnybright4893
    @tonnybright4893 5 months ago

    Skill issue 😂