My latest weekend project. Mixing the scalability of SQL and with the easy of use of spreadsheets
Вставка
- Опубліковано 10 лют 2025
- I've been processing a lot of local CSVs lately. Often feeling like the tools I'm using don't behave quite how I would like them to.
I built a prototype of an application remedy these frustrations. And also explore some existing solutions to fix similar problem using tools like Postgres, SQLite, HTTPie, Bash and JQ.
oh no my guy built access with postgres...
I usually use datagrip, it supports editing sql tables like a spreadsheet.
I usually use dataframes for local data processing. I convert CSVs and SQLite tables into dataframes, calculate/query something and convert the dataframe back to CSV or SQLite. In Julia the packages used for this are DataFrames.jl, CSV.jl and SQLite.jl, but other languages also have dataframe implementations, e.g. python (pandas data.frame), R (dplyr data.table) and Kotlin (dataframe).
Also Polars data frames for Python/Rust!
There's a really nice project called Teable that does exactly this.
You can use datagrip or table plus
I've definitely run into this type of thing before when I was working as a data analyst. I ended up making a custom python desktop GUI that could import several data formats (e.g. excel, csv, dbf) into a local SQLite database for easier analysis and then a table UI to see/query the data.
Not sure if you had already considered/used this, but if you end up working with postgres, its COPY command would be super helpful (similar to the SQLite feature you mentioned). As long as your data format is transformable into a CSV stream, you could bulk copy any data source into your local postgres database using most postgres clients. The COPY command also allows you to copy out data from a query as a CSV stream.
Well, you reinvented the wheel. You essentially need to look at headless CMS technology such as directus or strapi. It's literally just an UI on-top of a database and they offer features like REST/GraphQL API and authentication and permission layer. To catch your final use-case (Excel) you could write an Excel plugin that connects to a directus/strapi instance and go from there.
Did you comment this before you even got 2 minutes into the video where they say they must be reinventing the wheel
@@griffadev I started writing the comment before that point yes. I still decided to post it, even if repeating his own words, because I mention what he re-invented. Might be useful for him and others to know what he re-invented. These are great tools to archive a lot what he is going after in this video, so why not share it?
@@_modiXI find the urge to comment before even getting to half of the video a bit funny, but your heart was in the right place lol.
Thanks for the tip.
You described Microsoft access. It is a user friendly interface that can use its built in JET database engine or use it as a front end for mssql server or any ODBC accessible database.
Yeah this is basically what I was thinking the entire time, I wrote some shitty access DB in 2017 and it's STILL in use today, and is scheduled to be converted to run from a SQL backend in Azure.
Access is so unloved by most, but I'm its number 1 dumbass fan.
A cultured F1 fan I see.
Feedback: Audio feels really soft for this video compared to your other ones
You are correct, I didn't to compress this one. Good to know it makes a big difference! Thank you
Python + pandas in a Jupyter notebook handles this type of stuff extremely well. It definitely isn't the same experience as a spreadsheet, but it is generally much more powerful IMO.
You need to reformulate the PROBLEM. How would you handle relations between tables and NORMALIZATION ? Oracle has something called EXTERNAL TABLES. Basically, csv files as tables.
Yeah, look into datagrids such as ag-grid. These projects incorporate already probably most of the features you will ever need for the presented use-cases. I also recommend people that start businesses on a spreadsheet (a dietary recommendation one pops to mind) to convert it to one of these datagrids.
For the 'more convenient interface on top of the database' part, there are already lots of projects geared towards that specifically, ranging from admin panels like the Django one, to things like, surprise, exactly what you tried to present in the video: Airtable, "a spreadsheet-database hybrid, with the features of a database but applied to a spreadsheet" (lifted directly from wikipedia)
I'm trying to replace a abomination of a multi file and multi user excel macro monster accessed via RDP with data structures that are more graph like then relational.
EVERYTHING is done in excel... The UI, the reports, the shipping label printing and the entire business logic.
Guess how much fun I have. I've tried over the years multiple approaches to slay the beast but it constantly finds new ways to elude.
Currently I'm learning to write a Interpreter so I can write something capable of parsing all the relations between the VBA, formulas and cells spanning over the multiple files.
2:06 That flow is describing most web apps too!
My dad built something exactly like this for the financial interface for the lottery system owned and distributed by Tabcorp
NOTE: If exporting to flat text file then use TSV (tab separated values) not CSV... you will save yourself a world of non-standardised escaping pain.
TSV is achieved on Excel by "save as csv" then choosing tab as the separator... because MS UIs Imports likewise require an extra couple of clicks, but it beats the occasional record with transposed fields.
100% the case for DuckDB
I was looking for this comment. DuckDB is really good.
Very interesting video! I'm not personally aware of any tools specifically like this. I do have one nitpicky bit of criticism, though. I think you meant, "ease of use" in the title instead of "easy of use".
Ever considered a db gui? 😅
Really cool
Isn't this just like nocodb or bsserow?
Nocodb yes, Baserow no. The latter doesn't give you the Postgres access option.
You basically built Supabase :D Cool video tho :)
Duckdb instead of Postgres?
Python does this with ease. A shame the other languages doesn't offer musch for basic data analysis.
No, the point is having both GUI access and programmatic access to the same data, so you get the advantages of both of those access modes. The Python ecosystem doesn't offer that experience.
I'm just much more familiar with python programming, so I do pretty much anything you do here in pandas
You know that Grist exists and is self-hostable?
Also a grist user here and it's an amazing project. I wish it could support a postgresql backend and use it larger projects where data is also worked on outside of grist. But the features, user efficiency and project design is really wonderful for small to medium projects with less dependencies.
One problem with Grist is that relations are not DB foreign keys, but some Grist specific thing, so you can't really use SQL.
have you looked into dash ag grid?
Baserow
Jupiter Notebooks
Pocketbase
AirTable
You don't get the benefits of Postgres queryability.
@@samuelswatsonif that’s the goal surely a search engine (elk) and postgresql could be used together for the best querying (however not the easiest for analytics). Xata are doing this, I would highly recommend them
duckdb
cant understand. pls speak clearly