Speed Up Data Processing with Apache Parquet in Python

NeuralNine

9,440 views

Published Dec 2, 2024

COMMENTS • 18

@chndrl5649 1 year ago ⁺⁴
The reason the memory taken by the two dataframes differs is the datatypes. CSV will convert most predefined datatypes into strings, which are much larger than numeric datatypes.
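The size gap between text and numeric storage can be checked directly in pandas. The following is a minimal sketch, not the code from the video: the data is made up, and it compares one integer column stored as int64 against the same values stored as Python strings.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the same integers stored two ways.
n = 100_000
numeric = pd.Series(np.arange(n), dtype="int64")

# Force plain Python string objects, which is how text columns
# are commonly held when a column is not recognised as numeric.
as_text = numeric.astype(str).astype(object)

# deep=True counts the memory of the string objects themselves,
# not just the pointers to them.
numeric_bytes = numeric.memory_usage(index=False, deep=True)
text_bytes = as_text.memory_usage(index=False, deep=True)

print(f"int64 column:  {numeric_bytes:,} bytes")
print(f"string column: {text_bytes:,} bytes")
```

The int64 column costs exactly 8 bytes per value, while each Python string carries per-object overhead on top of its characters, so the text version comes out several times larger.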
@islam9212 1 year ago ⁺⁸
It hurt my eyes when I saw the calculator even though a Python console exists. For a future video it would be interesting to include a comparison with the pickle, Feather and Jay formats.
@tb9359 1 year ago ⁺³
Had never heard of Parquet. Thank you. It looks very useful.
@jeremiahhauser7148 1 year ago ⁺²
Interesting, but I am not convinced. If I understood correctly, when selecting columns the time went down by a factor of 3 for both methods (4 -> 1.3 s and 0.24 -> 0.08 s). So Parquet is better anyway, but whether it is specifically better for column-wise access still needs to be demonstrated.
Like the other commenter, I would also be interested in a broader comparison with other formats.
Great channel, keep up the good work.
@JeremyLangdon1 1 year ago
I think pandas tries to infer data types from CSV and often defaults to string. This takes much more space and CPU. Parquet has data types built into the file, so pandas does not need to infer anything. What would be more interesting is to specify the data types when reading the CSV, to make it a more "even" comparison.
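The "even" comparison suggested here could look like the sketch below. The column names and values are invented for illustration. Note that pandas does infer numeric types for clean numeric columns; it is text, category-like, and date columns that stay as generic strings unless told otherwise.

```python
import io

import pandas as pd

# Hypothetical CSV content, kept in memory so the example is self-contained.
csv_text = (
    "id,price,category,day\n"
    "1,9.99,a,2024-01-01\n"
    "2,19.50,b,2024-01-02\n"
    "3,5.00,a,2024-01-03\n"
)

# Default read: pandas infers what it can from the text.
inferred = pd.read_csv(io.StringIO(csv_text))

# Explicit read: types are declared up front, as a Parquet file would carry them.
typed = pd.read_csv(
    io.StringIO(csv_text),
    dtype={"id": "int32", "price": "float32", "category": "category"},
    parse_dates=["day"],
)

print(inferred.dtypes)
print(typed.dtypes)
```

With explicit types, the CSV reader skips inference and produces narrower columns, which makes the memory and timing comparison against Parquet fairer.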
@dana-pw3us 9 months ago ⁺¹
Why not compare sizes of files on a disk? Are they different?
@Gabriel-cf3bw 1 year ago
Nice tutorial! Very introductory!
@multitaskprueba1 7 months ago
You are a genius! Fantastic video! Thanks!
@slothner943 1 year ago
I usually go for the Feather format. I never understood the difference, just that for me and the data I'm handling (few columns) Feather seems to be quicker.
@JLSXMK8 1 year ago ⁺¹
I have a related question: Since Parquet files are "column-oriented", do you think they would be a good way to store database backups?
Example scenario: Let's say you want to store a database backup, assuming that the data in the database is in a stable state; it contains a large number of product records; maybe their IDs, descriptions, how many purchases for a product, the product prices, etc. Would it be a good idea to store a backup of this database using a Parquet file, since the backups would be faster to load in case the data becomes unstable via a transaction in the future? You could roll back the transactions too; however, what if too many of them fail, and all of them need to be rolled back?
@KingOfAllJackals 1 year ago
Parquet isn't a generic file format. It IS a table, so you wouldn't "store backups" in a Parquet file. I guess you could back up each table independently, but nearly every real DB has much more efficient and powerful native backup infrastructure.
Parquet, however, is where a lot of transactional data ends up for analytics. Columnar storage is more suited to large analytic workloads. Row stores are more suited to OLTP workloads. You would never want to use Parquet for things like "deduct $7.83 from customer 1234's checking account".
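The row-store versus column-store distinction can be illustrated with plain Python containers. This is a toy sketch and does not show how Parquet or any database lays out bytes. The customer numbers and balances are invented, and amounts are kept in cents to avoid floating-point rounding.

```python
# Row-oriented layout: each record is stored together.
rows = [
    {"customer": 1234, "balance_cents": 10_000},
    {"customer": 5678, "balance_cents": 25_000},
    {"customer": 9012, "balance_cents": 4_200},
]

# Column-oriented layout: each field is stored together.
columns = {
    "customer": [r["customer"] for r in rows],
    "balance_cents": [r["balance_cents"] for r in rows],
}

# Analytic query: total balance. Only one column has to be scanned.
total_cents = sum(columns["balance_cents"])

# OLTP-style update: deduct $7.83 from customer 1234.
# In the row layout this touches a single record in place.
for record in rows:
    if record["customer"] == 1234:
        record["balance_cents"] -= 783
        break

print(total_cents)               # 39200
print(rows[0]["balance_cents"])  # 9217
```

An update like this is awkward in a columnar file because the record's fields live in separate column chunks, while the aggregate is cheap there because unrelated fields are never read.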
@JLSXMK8 1 year ago
@KingOfAllJackals That is exactly what I thought of possibly using it for; I could use it to back up tables in the database. You did interpret that correctly. I would NOT edit the contents of the Parquet backups.
@farshidzamanirad9691 1 year ago
Awesome!
@N0rberK 1 year ago
Tnx Capt.
@julianreichelt1719 1 year ago
nice
@codewithmajid4841 1 year ago ⁺²
I am a junior data scientist from Pakistan
@codewithmajid4841 1 year ago
ok Boss
