6
Pro tip: a senior data engineer told me "just cache the intermediate table" back in 2016, and I ignored it for 3 years
I finally tried it on my Airflow pipeline last month and cut the runtime from 45 minutes to 12. Has anyone else resisted a simple trick for way too long?
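For anyone wondering what "cache the intermediate table" can look like in practice, here is a rough sketch in plain Python. It is not the actual pipeline code and does not use any Airflow API. The helper name `cached`, the `CACHE_DIR` location, and the pickle-on-local-disk storage are all assumptions made for illustration. The idea is just that an expensive step is computed once per set of parameters and reused afterwards.

```python
import hashlib
import json
import pickle
import tempfile
from pathlib import Path

# Hypothetical cache location; a real pipeline would likely use shared storage.
CACHE_DIR = Path(tempfile.gettempdir()) / "intermediate_cache"


def cached(step_name, params, compute):
    """Return compute()'s result, reusing a pickled copy if one exists for these params."""
    CACHE_DIR.mkdir(parents=True, exist_ok=True)
    # Key the cache file on the step name plus a hash of its parameters,
    # so a change in parameters triggers a fresh computation.
    key = hashlib.sha256(json.dumps(params, sort_keys=True).encode()).hexdigest()[:16]
    path = CACHE_DIR / f"{step_name}_{key}.pkl"
    if path.exists():
        with path.open("rb") as f:
            return pickle.load(f)
    result = compute()
    # Write to a temporary file first, then rename, so a crash mid-write
    # does not leave a half-written cache file behind.
    tmp = path.with_suffix(".tmp")
    with tmp.open("wb") as f:
        pickle.dump(result, f)
    tmp.replace(path)
    return result
```

A downstream step would call something like `cached("joined_orders", {"run_date": run_date}, build_joined_orders)`, where `build_joined_orders` and `run_date` are placeholders for whatever builds the expensive table. The part to be careful with is invalidation: if the upstream data changes without the parameters changing, the cached copy goes stale.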
3 comments
wendy131 28d ago
Cannot believe you sat on that for 3 whole years. That's insane. 45 minutes down to 12 just from one simple cache step? That's a massive difference. I bet you were kicking yourself pretty hard after finally trying it. Makes you wonder how many other simple tricks we all ignore because they sound too basic to actually work. Pretty eye-opening, honestly.
9
wood.faith 28d ago
"45 minutes down to 12" is actually wild to think about. That's like cutting 70% of your time just from one dumb little tweak. I honestly would've been so mad at myself for waiting that long, like how do you even explain that to someone? Makes you wonder what other basic stuff we're all ignoring because it sounds too simple to matter.
4
henryp40 28d ago
The "70%" thing is a bit off; it's more like 73% if you do the math exactly. Still a huge cut either way though.
3
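For anyone who wants to check the 73% figure from the thread, here is the arithmetic as a small sketch. The function name `percent_reduction` is made up for illustration; the 45 and 12 are the runtimes in minutes from the original post.

```python
def percent_reduction(before, after):
    """Percentage by which `after` is smaller than `before`."""
    return (before - after) / before * 100


reduction = percent_reduction(45, 12)  # (45 - 12) / 45 = 33 / 45
speedup = 45 / 12                      # how many times faster the new run is

print(f"{reduction:.1f}% shorter")  # 73.3% shorter
print(f"{speedup:.2f}x faster")     # 3.75x faster
```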