Appreciate the deep dive; I'm planning to do some hacking around with it myself. Do you know how it handles larger-volume workloads that typically need more than a single DuckDB node?
As far as I know, every function runs on a single node with automatic vertical scaling. Not sure if they plan to implement distributed computing.
That would be the dream! Not having to worry about compute at all, just knowing it'll be handled automatically (subject to some cost), would be amazing. My intuition is that Iceberg makes the most sense for the largest data workloads, but those are exactly the ones that need distributed compute. Then again, I come from the adtech world, so I'm biased there.