I tried this with a real-life dataset with around 5 million fact rows and 200,000 items belonging to 25 brands. The dataset is in import mode and the file size is around 200 MB. When grouping by the low-cardinality brand attribute, the Pareto calculations work really well. But when applying them to the individual articles in a matrix visual, the calculation runs for around 10 minutes and then runs out of memory, although the basic Sales calculation by item works well and finishes in around 2.5 seconds. Is there a recommended data limit for your approach, or any idea how to make it faster and, especially, less memory-intensive?
Quick question: are you using calculation groups to make it dynamic, or are you just trying to use it for each article?
I'm sure you have a star schema with product as a dimension, correct?
Yes, I have a star schema. The items with their brand attribute are in their own dimension table, and the relationship is one-to-many with full referential integrity. I'm using the calculation groups approach as you showed in the video.
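For readers following along: a minimal sketch of the kind of WINDOW-based cumulative measure this thread is discussing might look like the following. The names 'Product'[Item] and [Total Sales] are assumptions for illustration, not taken from the video, and in a real model this logic would likely sit inside the calculation group items rather than standalone measures.

// Sketch only: a WINDOW-based Pareto running total.
// 'Product'[Item] and [Total Sales] are hypothetical names.
Cumulative Sales (WINDOW) =
CALCULATE (
    [Total Sales],
    WINDOW (
        1, ABS,                              // from the top-ranked item ...
        0, REL,                              // ... down to the current item
        ALLSELECTED ( 'Product'[Item] ),     // rank within the selected items
        ORDERBY ( [Total Sales], DESC, 'Product'[Item], ASC )
    )
)

Pareto % (WINDOW) =
DIVIDE (
    [Cumulative Sales (WINDOW)],
    CALCULATE ( [Total Sales], ALLSELECTED ( 'Product'[Item] ) )
)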
I just published to a Premium Per User workspace to make sure it's not my laptop. But there as well, the error is "out of resources". Applying the Pareto calculations at the brand level works amazingly well in the service too.
I tested it on 20K products and it worked fine. I will try to get a larger dimension and test it against that. Maybe the WINDOW function approach is not the right one when working with large cardinality, and I'll have to look for an alternative solution. It's an opportunity to explore it further and do another video to make it work with larger cardinality. I will share my findings once I've had a chance to work on it.
Thanks for sharing your feedback and the input.
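In case it is useful while that investigation is ongoing: one possible alternative is the classic "running total by ranking value" pattern, which avoids WINDOW entirely. This is only a sketch under the same assumed names ('Product'[Item], [Total Sales]); whether it actually scales better to 200K items would need testing on the real model.

// Sketch only: Pareto running total without WINDOW.
// 'Product'[Item] and [Total Sales] are hypothetical names;
// items tied on sales value are lumped together by the >= comparison.
Cumulative Sales (classic) =
VAR CurrentSales = [Total Sales]
VAR ItemSales =
    ADDCOLUMNS (
        ALLSELECTED ( 'Product'[Item] ),
        "@Sales", [Total Sales]
    )
RETURN
    SUMX (
        FILTER ( ItemSales, [@Sales] >= CurrentSales ),
        [@Sales]
    )

Pareto % (classic) =
DIVIDE (
    [Cumulative Sales (classic)],
    CALCULATE ( [Total Sales], ALLSELECTED ( 'Product'[Item] ) )
)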
Excellent! Thank you so much
Glad it was helpful! Thanks for the comment. Best regards!
Great video! Please share the .pbix files for the examples shown.
Thanks for the feedback. I will put the link to the .pbix files in the description of the video and let you know when it is there. Thanks!
thanks