Airflow Docker: run Airflow 2.0 in a Docker container

  • Published 12 Nov 2024

COMMENTS • 86

  • @jhoanmartinezsilva2609
    @jhoanmartinezsilva2609 1 year ago

    My Airflow environment was so slow; after this it runs like a charm. Thank you!

  • @nishant2242
    @nishant2242 1 month ago

    I think Airflow installation is getting complicated nowadays, even for a developer with sound knowledge of infra. This video made my day.

  • @tobys914
    @tobys914 2 years ago +3

    Very clear and helpful tutorial so far, really appreciate it!

  • @TemporaryForstudy
    @TemporaryForstudy 1 year ago +1

    Thanks man, you saved my life. Love from India

  • @pythonmathui3057
    @pythonmathui3057 1 year ago

    Thank you so much, sir. I finally got Airflow installed properly.

    • @coder2j
      @coder2j  1 year ago

      You are welcome. 🤗

  • @LearnRight
    @LearnRight 2 years ago +1

    Thank you so much for the amazing explanation.

  • @barathelango267
    @barathelango267 7 months ago

    i like the way you say "excutable"

  • @zhalie12345
    @zhalie12345 3 years ago

    Hello coder2j...
    Thanks for the clear explanation, I'm going to try this at home tonight. Gotta learn fast.
    Looking forward to more content! ^^

  • @artistry72
    @artistry72 1 year ago

    Great tutorial. Amazing explanation. Thank you so much.

  • @mambomambo4363
    @mambomambo4363 8 months ago

    Came to learn airflow, stayed for boom 💥

    • @coder2j
      @coder2j  8 months ago

      🙌🙌

  • @Uxdkkkhcddvbh
    @Uxdkkkhcddvbh 3 years ago

    This is so amazing, the best tutorial by far!! Thank you so very much!!!! Amazing!!

    • @coder2j
      @coder2j  3 years ago

      Glad it helped!

  • @fsyrhiz
    @fsyrhiz 3 years ago

    Hello coder2j, that was great! Thank you very much.

  • @TamaraT-bn4lq
    @TamaraT-bn4lq 1 year ago

    Hey and thanks for the tutorial! It is great! It also would be nice to see the terminal commands that you use in the videos. :)

    • @coder2j
      @coder2j  1 year ago

      Do you mean that certain terminal commands are not visible in the video, or are you suggesting having them in the video description?

  • @jaquesderasmo5496
    @jaquesderasmo5496 3 years ago

    Great content. Please keep doing it.

  • @stevefernando291
    @stevefernando291 1 year ago

    Great vid! How can I properly remove its PostgreSQL service and volume? I am trying to compose up just an Airflow service and then hook it to my own PostgreSQL container, but I keep getting an error when composing.

    • @coder2j
      @coder2j  1 year ago

      You can remove the postgres definition in the docker compose yaml file.

  • @jhoanmartinezsilva2609
    @jhoanmartinezsilva2609 1 year ago

    I need the same for installing Kafka. Would a tutorial be possible? Thanks a lot.

  • @VishalChandra-ku1qu
    @VishalChandra-ku1qu 3 months ago

    How can we set this up for multiple environments like Dev and Prod? Can you please guide us through it?

    • @coder2j
      @coder2j  3 months ago

      You can use the same docker compose config and deploy it to different virtual machines or EC2 instances for the staging and production environments.

  • @bokistotel
    @bokistotel 2 years ago

    THAT WAS GREAT! SUBBED!

  • @ooossssss
    @ooossssss 1 year ago

    Docker Desktop is stuck on "starting...". I've tried pretty much everything suggested on Stack Overflow to fix it (wsl --update). Any ideas? I'm on Windows 10.

  • @nicholascanova4250
    @nicholascanova4250 3 years ago

    very helpful, thanks

    • @coder2j
      @coder2j  3 years ago

      You are welcome!

  • @balive053
    @balive053 1 year ago

    If you want to keep it running on CeleryExecutor, what is the difference in effect between that and LocalExecutor?

    • @coder2j
      @coder2j  1 year ago +2

      With CeleryExecutor you have the possibility to scale up with more workers. But if you are running it on a single machine, there is not much difference from LocalExecutor.

    • @balive053
      @balive053 1 year ago +1

      @@coder2j Ok, thanks very much!

  • @yennhiluongnguyen9176
    @yennhiluongnguyen9176 6 months ago

    Can I ask why, in the step where Airflow is installed in airflow_tutorial, I can open the web UI, but in the airflow_docker video I cannot open the web UI, although I have done exactly as you instructed? Please give me some help; I have been stuck on this for 2 days.

    • @coder2j
      @coder2j  6 months ago

      Please check the Airflow webserver log and see what the error is.

  • @MrPelastus
    @MrPelastus 1 year ago

    Thanks for the video. It can't get clearer than this. I was wondering: What if I decide not to edit the docker-compose.yaml file? Does it really matter?

    • @coder2j
      @coder2j  1 year ago +1

      The only difference is that you will be using CeleryExecutor instead of LocalExecutor if you don't change anything.

  • @lucashuang2691
    @lucashuang2691 3 years ago

    thanks for sharing

    • @coder2j
      @coder2j  3 years ago

      Thanks for watching!

  • @goodmansaul5232
    @goodmansaul5232 1 year ago

    Hi coder2j, I just want to ask if there is any big difference between running Airflow on Kubernetes and on Docker. I know that Kubernetes can automatically reallocate resources to other Pods when some Pods are done. Would Airflow on Docker do the same? Thank you so much!

    • @coder2j
      @coder2j  1 year ago +1

      They are different. Airflow on Docker means running Airflow in the Docker container runtime. Kubernetes is a tool to orchestrate container-based applications running on a cluster of servers. Running Airflow in a (Docker) container doesn't mean it autoscales out of the box, but it is a prerequisite for tools like Kubernetes to manage it at scale.

    • @goodmansaul5232
      @goodmansaul5232 1 year ago

      @@coder2j Thank you so much for your reply!
      In practice, do we commonly use Kubernetes to manage Airflow in Docker? I found it fairly complicated to do that, even when using the Helm chart.😅

    • @coder2j
      @coder2j  1 year ago +1

      It depends on how you use Airflow. If you outsource the heavy computation, for example to a Spark cluster, Airflow only does the basic scheduling and management jobs, which don't need a lot of resources. Otherwise, you need to scale Airflow using either CeleryExecutor or Kubernetes.

    • @goodmansaul5232
      @goodmansaul5232 1 year ago +1

      @@coder2j Thank you for your reply!
      I got your points, they really make sense. Thank you.

  • @anmoldhuwalia167
    @anmoldhuwalia167 4 months ago

    finally!!

  • @helovesdata8483
    @helovesdata8483 3 years ago

    Boom!! I did it

    • @coder2j
      @coder2j  3 years ago

      Nice to hear that.

  • @afamefunacharles7767
    @afamefunacharles7767 1 year ago

    Hi coder2j, is the password "airflow" in the yaml file different from the password of the postgres running on the machine?

    • @coder2j
      @coder2j  1 year ago

      No, they are the same. Check if you already have postgres instance running locally on port 5432.
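
A minimal sketch of the check suggested in the reply above: it tests whether anything is already accepting TCP connections on port 5432 of the local machine. The function name `port_in_use` and the `127.0.0.1` host are illustrative assumptions, not something shown in the video.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # 5432 is the postgres port mentioned in the reply above
    if port_in_use(5432):
        print("Port 5432 is already in use; a local postgres may be running.")
    else:
        print("Port 5432 looks free.")
```

If the port is taken, the postgres container and the local instance would likely compete for it whenever the compose file publishes 5432 to the host.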

  • @MuhammadAkbar-ij4lm
    @MuhammadAkbar-ij4lm 2 years ago

    I was wondering why you deleted the Airflow worker from the docker compose file. What are the reasons? Is it fine to run Airflow without an Airflow worker?

    • @coder2j
      @coder2j  2 years ago

      If we use the local executor, all the Airflow jobs run in the scheduler container. Workers are needed if you use a distributed setup, such as the Celery executor.
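
A rough sketch of what switching the compose file from CeleryExecutor to LocalExecutor involves. The dict below is a hypothetical, trimmed-down model of the docker-compose.yaml (written as a plain Python dict so no YAML library is needed), and treating `redis` and `flower` as Celery-only services alongside the worker is an assumption based on the layout of the official compose file; the thread itself only confirms that the worker was removed.

```python
import copy

# Hypothetical, simplified stand-in for the docker-compose.yaml contents.
CELERY_COMPOSE = {
    "x-airflow-common": {
        "environment": {
            "AIRFLOW__CORE__EXECUTOR": "CeleryExecutor",
            "AIRFLOW__CELERY__RESULT_BACKEND": "db+postgresql://airflow:airflow@postgres/airflow",
            "AIRFLOW__CELERY__BROKER_URL": "redis://:@redis:6379/0",
        },
        "depends_on": {"redis": {}, "postgres": {}},
    },
    "services": {
        "postgres": {},
        "redis": {},
        "airflow-webserver": {},
        "airflow-scheduler": {},
        "airflow-worker": {},
        "airflow-init": {},
        "flower": {},
    },
}

# Services assumed to exist only to support Celery.
CELERY_ONLY_SERVICES = ("redis", "airflow-worker", "flower")


def to_local_executor(compose: dict) -> dict:
    """Return a copy of the compose model configured for LocalExecutor."""
    result = copy.deepcopy(compose)
    common = result["x-airflow-common"]
    env = common["environment"]
    env["AIRFLOW__CORE__EXECUTOR"] = "LocalExecutor"
    # Celery-specific settings are unused once tasks run in the scheduler.
    for key in [k for k in env if k.startswith("AIRFLOW__CELERY__")]:
        del env[key]
    common["depends_on"].pop("redis", None)
    for name in CELERY_ONLY_SERVICES:
        result["services"].pop(name, None)
    return result


if __name__ == "__main__":
    print(sorted(to_local_executor(CELERY_COMPOSE)["services"]))
```

With LocalExecutor, the scheduler container runs the tasks itself, which is why the sketch keeps `airflow-scheduler` and drops the worker.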

  • @saisandeeppalla7713
    @saisandeeppalla7713 3 years ago

    Thanks

    • @christopherkalolo1805
      @christopherkalolo1805 1 year ago

      Coder2j, thanks for this great video... Please, I am having problems with docker-compose up airflow-init. I'm getting this error consistently
      docker-compose up airflow-init
      [+] Running 0/15
      ⠹ postgres Pulling 10.1s
      ⠸ f1f26f570256 Pulling fs layer 1.2s
      ⠸ 1c04f8741265 Pulling fs layer 1.2s
      ⠸ dffc353b86eb Pulling fs layer 1.2s
      ⠸ 18c4a9e6c414 Waiting 1.2s
      ⠸ 81f47e7b3852 Waiting 1.2s
      ⠸ 5e26c947960d Waiting 1.2s
      ⠸ a2c3dc85e8c3 Waiting 1.2s
      ⠸ 17df73636f01 Waiting 1.2s
      ⠸ 124bb42a3852 Waiting 1.2s
      ⠸ dfb19482a052 Waiting 1.2s
      ⠸ bbb12a596105 Waiting 1.2s
      ⠸ aa8960c4e383 Waiting 1.2s
      ⠸ fdbdb6eba8dc Waiting 1.2s
      ⠿ airflow-init Error 10.1s
      Error response from daemon: pull access denied for extending_airflow, repository does not exist or may require 'docker login': denied: requested access to the resource is denied
      Any ideas please?

    • @coder2j
      @coder2j  1 year ago

      If you are running with the source code from the GitHub repo, make sure to check this commit: github.com/coder2j/airflow-docker/commit/576fb2f78549c62d554e1675af0045956f7f0d69

  • @arjitbasu78
    @arjitbasu78 2 years ago

    Hey, I followed your exact steps, but my containers for the scheduler and webserver keep restarting, so I cannot visualize anything!! Please help.

    • @coder2j
      @coder2j  2 years ago

      What about your postgres container? Does it also keep restarting? Check and compare your docker-compose.yaml file with this commit: github.com/coder2j/airflow-docker/commit/576fb2f78549c62d554e1675af0045956f7f0d69

  • @yogiananta9674
    @yogiananta9674 11 months ago

    How do I install a new Python module in Airflow installed via Docker?

    • @coder2j
      @coder2j  11 months ago +1

      You can check out this video ua-cam.com/video/0UepvC9X4HY/v-deo.html

    • @yogiananta9674
      @yogiananta9674 11 months ago

      @@coder2j I get the issue ModuleNotFoundError: No module named 'pymysql',
      even though I have added pymysql to the requirements.txt file:
      PyMySQL==1.0.2
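
The thread does not say what caused this error. One way to narrow it down is to check, with the Python interpreter inside the Airflow container, whether the module is importable at all; if it is not, the containers may still be running an image that was built before the package was added to requirements.txt (an assumption, not confirmed here). The helper name `module_available` is illustrative.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if the top-level module `name` can be found by this interpreter."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # "pymysql" is the import name of the PyMySQL package from the comment above
    for module in ("pymysql",):
        status = "found" if module_available(module) else "MISSING"
        print(f"{module}: {status}")
```

Note that the import name (`pymysql`) can differ from the name used in requirements.txt (`PyMySQL`), so the check uses the import name.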

  • @mank5890
    @mank5890 3 years ago

    I am trying to add a new DAG in the dags folder, but I am getting an "Import airflow could not be resolved" error in VS Code. What's the best way to fix this? Thanks in advance.

    • @coder2j
      @coder2j  3 years ago

      If you are running Airflow in Docker, the airflow package is installed inside the Docker container, where VS Code cannot see it. Therefore, you can either ignore the error or create a local Python environment, install airflow in it, and tell VS Code the path of that Python environment. That should resolve the issue.

  • @aldoaguirre9864
    @aldoaguirre9864 1 year ago

    How can I add some Python packages, such as pyspark, s3, and so on?

    • @coder2j
      @coder2j  1 year ago

      Check out this video: ua-cam.com/video/0UepvC9X4HY/v-deo.html

  • @evLover21
    @evLover21 3 years ago

    Good to start with 2.0. I have a question: how do I add Python libraries to the image, like we usually do with RUN pip install?

    • @coder2j
      @coder2j  3 years ago +1

      The easiest way to do that is to extend the official Apache Airflow Docker image. So basically you create a Dockerfile as follows:
      FROM apache/airflow:2.0.1
      COPY requirements.txt /requirements.txt
      RUN pip install --user --upgrade pip
      RUN pip install --no-cache-dir --user -r /requirements.txt
      You will have to create a requirements.txt file in the same directory as the Dockerfile; it will be copied into the image and its packages installed.
      Then you use the docker build command to build the extended image:
      docker build . --tag my_airflow:latest
      After that, you need to replace the Airflow Docker image name in the docker-compose.yaml file, changing it from the official image to your extended image my_airflow:latest. That's it. The remaining steps are the same: you run docker-compose up airflow-init and docker-compose up to launch the Airflow webserver and scheduler.

    • @evLover21
      @evLover21 3 years ago

      @@coder2j Yes, I figured that out on the same day, just after posting my comment :-). We have Airflow 1.x setups in our project with everything in "requirements.txt", which is executed by "entrypoint.sh" during container initialization (refer to the 1.x git and entrypoint.sh), and we were struggling to do it that way in the 2.x PoC environment. Later we found all these details in the 2.x git (refer to the Dockerfile of 2.x)... but thanks for replying. Looking forward to seeing more videos on topics like task chaining, DAG chaining, and dynamic task creation on the fly to leverage multi-processing in parallel. I am reading about those in the 2.x documentation, but it would be good to have them in videos. Thanks again.

    • @coder2j
      @coder2j  3 years ago

      You are welcome! I am glad to hear that you found a solution. :-)

    • @coder2j
      @coder2j  3 years ago

      @cookie you are welcome!

  • @harshithajnharshitha7838
    @harshithajnharshitha7838 9 months ago

    Is it necessary to install Docker?

    • @coder2j
      @coder2j  9 months ago

      Theoretically you can run it locally to follow the tutorial, but it is recommended to install Docker, as the following videos run Airflow in Docker.

  • @noviandid
    @noviandid 3 years ago

    I've been following your guidance, but when I test-run a DAG manually, it keeps running but never finishes... When I look at the .log file, it seems to be looping... Do you know why? Thanks for the reply.

    • @coder2j
      @coder2j  3 years ago

      It's hard to tell what exactly went wrong from the info provided. I think you can try checking your DAG implementation. It might have some loop logic that never stops.

    • @noviandid
      @noviandid 3 years ago

      @@coder2j I'm running the example_bash_operator DAG.

    • @noviandid
      @noviandid 3 years ago

      I'm sorry, my bad. I didn't turn on the DAG and just found out it won't run even if you trigger it manually... Sorry, beginner error.

  • @anjanashetty482
    @anjanashetty482 2 years ago

    Installing Airflow on Docker gives a message to upgrade the Airflow db. But when I try airflow db upgrade, I get the error: airflow command not found. Please help.

    • @coder2j
      @coder2j  2 years ago

      Can you share your docker compose yaml file?

    • @anjanashetty482
      @anjanashetty482 2 years ago

      @@coder2j I was able to start the Airflow webserver; I had to enable permissions on my path. But now I have run into a different error: "latest-test-repo-airflow-webserver-1 | error: option --workers not recognized" and "latest-test-repo-airflow-scheduler-1 | error: invalid command 'scheduler'". Also, could you please let me know how and where I can share the docker-compose yaml file with you?

  • @TvsCar30
    @TvsCar30 1 year ago

    When I run the 'airflow webserver -p 8080' command, I get an error on import pwd:
    ModuleNotFoundError: No module named 'pwd'
    I need some help!!! Thanks

    • @coder2j
      @coder2j  1 year ago

      Why do you need this command if you are running airflow in docker?

  • @TanushreeNagar-tt1pq
    @TanushreeNagar-tt1pq 6 months ago

    Unable to install Docker.

  • @satishprasad4663
    @satishprasad4663 1 year ago

    The username is airflow, but what is the password?

    • @coder2j
      @coder2j  1 year ago +1

      The password is also airflow in the demo I showed.

  • @FauzanoMohammad-b7m
    @FauzanoMohammad-b7m 9 months ago

    I forgot to input -d.

    • @coder2j
      @coder2j  9 months ago

      Without -d, the container will run in the foreground.