You are a lifesaver, sir. I was not able to get my Airflow scheduler up no matter what; then I followed your tutorial and it's finally running. Also, for macOS users: when creating a virtual env, make sure that python points to the Homebrew bin Python installation.
You are welcome! Glad you finally got Airflow running. 🙌
Thank you very very very very much. I was looking everywhere for such an excellent tutorial video but was always getting stuck somewhere. This is gold :)
Thank you for your warm comment.
You're such a great teacher. Please make more videos about Airflow. It would be best if you made a project-based Airflow tutorial.
Thanks, there will be more videos coming. Stay tuned!
It's really amazing. Everything is clear and structured. I can easily follow your steps to learn and finish my work. Thanks a lot! By the way, the design is also very good. : )
Thanks for your nice comment! I am glad that it helps.
Thank you !
Clear, correct, and short =)
Glad you like it.
simply an amazing tutorial for beginners, congratulations!
Thanks for your nice comment! 😁
Thank you so much, sir, this video was very useful :)
You are welcome 🤗
Thanks for the video - nice explanations! Would be curious to understand what your opinions are around pros & cons for using Airflow?
You are welcome! This is a good question; you can read my blog post here: coder2j.com/apache-airflow/apache-airflow-introduction-and-local-installation-guide/
How do I resolve the pwd and resource module errors when running "airflow webserver -p 8080"?
worked for me
Hello coder2j, can you please do a lesson on copying data from an API and using it in Airflow? For instance, copying sales data and weather forecast data into Airflow to train a machine learning model that predicts sales over a particular season of the year. It could be the sales of umbrellas, a certain type of clothing, cream, or even ice cream.
Wow, thanks for suggesting it. This could be a great end-to-end practical ML course. Who else wants to see this series? Please shout here. :-)
@coder2j Yes +1
Thank you so much.
Nice tutorial. But I'm trying to install Airflow on my Windows machine, and when I try to start the webserver it tells me there is no 'pwd' module. Could you please explain, or provide a reference on, how to start the Airflow webserver on Windows?
On Windows you have to use the Windows Subsystem for Linux (WSL) to install Airflow.
I have a couple of questions:
1. How can we change the default DB from SQLite to Postgres? That is, run all the migrations in Postgres instead of SQLite.
2. Can we connect to a local Postgres instance from Airflow->Admin->Connections?
Thanks for posting.
1. You can check the part 2 tutorial which uses postgres db as the backend. You can migrate changes from sqlite into postgres if needed.
2. Already working on it. Stay tuned!! 🙌
Amazing, very well explained.
Thanks for your comment Matheus. Nice to hear that you like it!
Thank you so much 💓
You are welcome. :-)
Hi, great video. However, I am stuck at installing Airflow at 5:59 with ModuleNotFoundError: No module named 'wtforms.compat'. I have installed the constraints properly and am still receiving the error. Please help.
Hi, I am getting this error
The conflict is caused by:
apache-airflow 2.5.2 depends on python-daemon>=3.0.0
The user requested (constraint) python-daemon==2.3.2
To fix this you could try to:
1. loosen the range of package versions you've specified
2. remove package versions to allow pip attempt to solve the dependency conflict
Thanks for sharing!
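One likely cause of a conflict like the one above is a constraints file written for a different Airflow release than the one pip is installing. As a minimal sketch, assuming the constraint-file URL pattern described in the official Airflow installation docs, the helpers below build a matching URL and pip command. The names `constraint_url` and `pip_install_command` are hypothetical, not part of Airflow.

```python
import sys
from typing import List, Optional

# Assumed base URL pattern for the published Airflow constraint files.
CONSTRAINTS_BASE = "https://raw.githubusercontent.com/apache/airflow"


def constraint_url(airflow_version: str, python_version: Optional[str] = None) -> str:
    """Build the constraints URL for one Airflow version and one Python minor version."""
    if python_version is None:
        # Fall back to the interpreter that is running this script.
        python_version = f"{sys.version_info.major}.{sys.version_info.minor}"
    return (
        f"{CONSTRAINTS_BASE}/constraints-{airflow_version}"
        f"/constraints-{python_version}.txt"
    )


def pip_install_command(airflow_version: str, python_version: Optional[str] = None) -> List[str]:
    """Return the pip argument list that pins Airflow and its matching constraints."""
    return [
        "pip",
        "install",
        f"apache-airflow=={airflow_version}",
        "--constraint",
        constraint_url(airflow_version, python_version),
    ]


if __name__ == "__main__":
    # Print the command for the version mentioned in the error message above.
    print(" ".join(pip_install_command("2.5.2")))
```

The point of the sketch is that the Airflow version appears twice, once in the package pin and once in the constraints URL, and both need to agree.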
Is this Airflow series enough for me to understand the fundamentals?
Yes, it is a beginner's guide. You will get started fast and learn the essential fundamentals.
Many thanks. I am at a beginner stage. What skills are required to be a junior data engineer?
I am learning the following: Python, SQL, the fundamentals of data warehousing (a course I found on Udemy), and Apache Spark. Will that be sufficient for me to secure a junior position?
In addition, I will do three projects.
In my opinion, a junior data engineer needs Python coding skills; the ability to work with data from various sources such as SQL Server, MySQL, and PostgreSQL; experience with an ETL platform like Airflow and with cloud services like AWS, Google Cloud, or Azure; and basic software engineering knowledge such as how to write clean code, unit tests, and CI/CD pipelines. Familiarity with Apache Spark and Docker would be a plus.
@coder2j I spoke to several data engineers. I was told Python, SQL, and data warehousing are sufficient for a junior role, while cloud, Airflow, and Apache Spark depend on which tools the company uses. I will learn Airflow and basic Spark using the Python API.
Thanks
Thank you for the course ..
Trying to run it locally, but I'm getting this error message "{webserver_command.py:252} ERROR - [0 / 0] Some workers seem to have died and gunicorn did not restart them as expected" => Please let me know how to bypass this .. thanks
Which OS platform and airflow version? When do you get this error?
same issue here!
On my end, I had to export AIRFLOW_HOME this way for it to work globally:
export AIRFLOW_HOME=~/airflow
May I ask, is your executor local or remote? Which one exactly? Thanks.
Local executor. You can check the docker compose YAML settings in part 2: ua-cam.com/video/J6azvFhndLg/v-deo.html
I am facing this error when I type "python3 --version":
*Python was not found; run without arguments to install from the Microsoft Store, or disable this shortcut from Settings > Manage App Execution Aliases.*
Can you help me with this?
When you install Python, make sure you tick the option that adds Python to PATH.
Why is airflow.cfg not in my current directory?
Because the AIRFLOW_HOME environment variable is not set correctly. You need to set it to the absolute path of your current directory.
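To see which directory Airflow would treat as its home, a minimal sketch like the following may help. It assumes the usual behaviour that Airflow reads `AIRFLOW_HOME` and falls back to `~/airflow` when the variable is unset. The function names are hypothetical helpers, not Airflow APIs.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def effective_airflow_home(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the directory AIRFLOW_HOME points to, or ~/airflow if it is unset."""
    env = os.environ if env is None else env
    raw = env.get("AIRFLOW_HOME", "~/airflow")  # assumed default location
    return Path(raw).expanduser()


def is_absolute_home(env: Optional[Mapping[str, str]] = None) -> bool:
    """True when AIRFLOW_HOME is an absolute path, which is what the reply recommends."""
    return effective_airflow_home(env).is_absolute()


if __name__ == "__main__":
    home = effective_airflow_home()
    print(f"Airflow home: {home}")
    if not home.is_absolute():
        print("AIRFLOW_HOME is relative; set it to an absolute path instead.")
```

In a shell, exporting the absolute path of the project folder (for example with `export AIRFLOW_HOME="$(pwd)"`) before running `airflow db init` is one way to make airflow.cfg land in the current directory.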
when I run airflow scheduler, it throws OSError: [Errno 48] Address already in use
It might be that port 8080 is being used by another process. Stop it and try again.
@coder2j I'm getting this error frequently.
Please advise:
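A quick way to test the "port already in use" theory is to try connecting to the port locally. This is a minimal sketch using only the standard library; `port_in_use` is a hypothetical helper name. Port 8080 comes from the reply above. As an assumption worth checking against your own airflow.cfg, the scheduler may also start a small log server (commonly port 8793 in Airflow 2.x), so an already running scheduler is another thing to rule out.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # 8080 is the webserver port used in the thread; 8793 is an assumed default.
    for candidate in (8080, 8793):
        state = "in use" if port_in_use(candidate) else "free"
        print(f"port {candidate}: {state}")
```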
The scheduler does not appear to be running. The last heartbeat was received 25 minutes ago.
The DAGs list may not update, and new tasks will not be scheduled.
I get this exception when I run the command "airflow db init":
File "/c/projects/airflow/setup1/airflow_env/lib/python3.10/site-packages/airflow/www/app.py", line 84, in create_app
raise AirflowConfigException(
airflow.exceptions.AirflowConfigException: Cannot use relative path: `sqlite:///./airflow.db` to connect to sqlite. Please use absolute path such as `sqlite:////tmp/airflow.db`.
Just go to the airflow.cfg file and change sql_alchemy_conn = sqlite:///./airflow.db so that it uses an absolute path.
Has anyone had:
ModuleNotFoundError: No module named 'pwd'
after running airflow webserver -p 8080?
Yeah, I'm getting the pwd and resource module errors 😢. Please let me know if you resolve the issue.
Airflow is not supported natively on Windows. If you are running it on Windows, you have to run it in the Windows Subsystem for Linux (WSL) or with Docker.
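The absolute form of the connection string can be derived from the relative one. A minimal sketch, assuming a POSIX-style filesystem: `absolute_sqlite_uri` is a hypothetical helper, and the result is meant to be pasted into `sql_alchemy_conn` in airflow.cfg by hand.

```python
from pathlib import Path


def absolute_sqlite_uri(db_file: str, base_dir: str = ".") -> str:
    """Turn a relative SQLite file path into an absolute sqlite:/// URI."""
    db_path = Path(db_file).expanduser()
    if not db_path.is_absolute():
        # Interpret relative paths against base_dir (for example AIRFLOW_HOME).
        db_path = Path(base_dir).expanduser() / db_path
    # Three slashes for the scheme plus the leading slash of the absolute
    # path gives the four-slash form from the error message.
    return "sqlite:///" + db_path.resolve().as_posix()


if __name__ == "__main__":
    print(absolute_sqlite_uri("./airflow.db", base_dir="."))
```

A later comment in this thread reports that `export AIRFLOW_HOME=.` triggers the relative path in the first place, so setting AIRFLOW_HOME to an absolute directory before running `airflow db init` may avoid the manual edit.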
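The `pwd` and `resource` modules are part of the Python standard library only on Unix-like systems, which is why the import fails on native Windows. A minimal sketch to confirm what the current interpreter offers before starting the webserver; `missing_modules` is a hypothetical helper name.

```python
import importlib.util
import platform
from typing import Iterable, List


def missing_modules(names: Iterable[str] = ("pwd", "resource")) -> List[str]:
    """Return the module names that cannot be found by this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print(f"{platform.system()}: missing {', '.join(missing)}; try WSL or Docker.")
    else:
        print(f"{platform.system()}: Unix-only modules are available.")
```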
It gives me this error:
PS D:\Airflow> source py_env/bin/activate
source : The term 'source' is not recognized as the name of a cmdlet, function, script file, or operable program. Check the spelling of the name, or if a path was
included, verify that the path is correct and try again.
At line:1 char:1
+ source py_env/bin/activate
+ ~~~~~~
+ CategoryInfo : ObjectNotFound: (source:String) [], CommandNotFoundException
+ FullyQualifiedErrorId : CommandNotFoundException
Airflow is not supported natively on Windows. You can try it on WSL.
Mine installed without the examples
For me, the commands "airflow db init" and "airflow webserver -p 8080" don't work when I'm inside py_env; can you tell me why? When I run "airflow webserver -p 8080" I get this output:
Traceback (most recent call last):
File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/home/amb/.local/lib/python3.10/site-packages/gunicorn/__main__.py", line 7, in <module>
run()
File "/home/amb/.local/lib/python3.10/site-packages/gunicorn/app/wsgiapp.py", line 67, in run
WSGIApplication("%(prog)s [OPTIONS] [APP_MODULE]").run()
File "/home/amb/.local/lib/python3.10/site-packages/gunicorn/app/base.py", line 231, in run
super().run()
File "/home/amb/.local/lib/python3.10/site-packages/gunicorn/app/base.py", line 72, in run
Arbiter(self).run()
File "/home/amb/.local/lib/python3.10/site-packages/gunicorn/arbiter.py", line 58, in __init__
self.setup(app)
File "/home/amb/.local/lib/python3.10/site-packages/gunicorn/arbiter.py", line 118, in setup
self.app.wsgi()
File "/home/amb/.local/lib/python3.10/site-packages/gunicorn/app/base.py", line 67, in wsgi
self.callable = self.load()
File "/home/amb/.local/lib/python3.10/site-packages/gunicorn/app/wsgiapp.py", line 58, in load
return self.load_wsgiapp()
File "/home/amb/.local/lib/python3.10/site-packages/gunicorn/app/wsgiapp.py", line 48, in load_wsgiapp
return util.import_app(self.app_uri)
File "/home/amb/.local/lib/python3.10/site-packages/gunicorn/util.py", line 412, in import_app
app = app(*args, **kwargs)
File "/home/amb/.local/lib/python3.10/site-packages/airflow/www/app.py", line 181, in cached_app
app = create_app(config=config, testing=testing)
File "/home/amb/.local/lib/python3.10/site-packages/airflow/www/app.py", line 99, in create_app
raise AirflowConfigException(
airflow.exceptions.AirflowConfigException: Cannot use relative path: `sqlite:///./airflow.db` to connect to sqlite. Please use absolute path such as `sqlite:////tmp/airflow.db`.
Go to your config file, change the SQLite path to an absolute path.
@coder2j The problem seems to be the command export AIRFLOW_HOME=. ; as soon as I run AIRFLOW_HOME=~/airflow everything works again, but I don't think that's the right solution.
Can you help me?
I got this when running airflow db init:
"WARNI [airflow.models.crypto] empty cryptography key - values will not be stored encrypted."
The new logs folder, airflow.cfg, and airflow.db are not being created. I use WSL 2 on Windows 10.
Make sure the environment variable `AIRFLOW_HOME` is set to the proper directory; otherwise it defaults to the airflow folder in your home directory.
@coder2j I typed 'export AIRFLOW_HOME = /' before installing Apache Airflow and then ran airflow db init. Can you tell me how to check whether AIRFLOW_HOME points to the right directory?
Ah, I get the same warning, but the logs folder, airflow.cfg, and airflow.db are created.
Thanks
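One way to check whether a directory is the one Airflow actually initialised is to look for the files mentioned in this thread. A minimal sketch: `inspect_airflow_home` is a hypothetical helper, and the expected names (`airflow.cfg`, `airflow.db`, `logs`) are assumptions based on what the comments above describe. Note also that shells such as bash do not accept spaces around `=` in an assignment, so a command typed exactly as quoted above would not set the variable.

```python
import os
from pathlib import Path
from typing import Dict

# Entries the thread reports seeing after "airflow db init" (assumed names).
EXPECTED_ENTRIES = ("airflow.cfg", "airflow.db", "logs")


def inspect_airflow_home(home: str) -> Dict[str, bool]:
    """Report which expected files or folders exist under the given directory."""
    root = Path(home).expanduser()
    return {name: (root / name).exists() for name in EXPECTED_ENTRIES}


if __name__ == "__main__":
    # Use AIRFLOW_HOME if set, otherwise the assumed default of ~/airflow.
    home = os.environ.get("AIRFLOW_HOME", "~/airflow")
    print(f"Checking {home}")
    for name, present in inspect_airflow_home(home).items():
        print(f"  {name}: {'found' if present else 'missing'}")
```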