People often ask why I contribute to open source projects or otherwise work on building automated tooling. They see me spending hours automating a task or fixing a bug that takes seconds to do or avoid manually, in a way that the original XKCD comic says won’t pay off. The disconnect seems to be that the comic and those people consider only the time it saves me, not the time it saves the tens, thousands, or millions of other people who will use the script or patch when I publish it. So, here’s a version of xkcd.com/1205 updated for making decisions that benefit a thousand people instead of just one.
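To make the scaling argument concrete, here is a rough sketch of the break-even arithmetic. The function name, the example numbers, and the `users` parameter are assumptions for illustration; the five-year horizon follows the original comic.

```python
def break_even_seconds(seconds_saved_per_use, uses_per_day, users=1, years=5):
    """Total time saved over the horizon, i.e. the most time worth investing.

    `users` is a hypothetical multiplier for everyone who benefits
    once the script or patch is published.
    """
    days = 365 * years
    return seconds_saved_per_use * uses_per_day * days * users


# Illustrative numbers only: a task that costs 5 seconds, done once a day.
solo = break_even_seconds(5, 1)
shared = break_even_seconds(5, 1, users=1000)

print(f"one person:      {solo / 3600:.1f} hours")
print(f"a thousand users: {shared / 3600:.1f} hours")
```

With a thousand beneficiaries the budget grows a thousandfold, which is why a fix that looks wasteful for one person can still be worth it.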

  • jas0n@lemmy.world
    5 months ago

    I used to write tons of automation in my previous data role. While time saved matters, the other important takeaway is reproducibility. Other people on the team were writing giant SQL scripts, highlighting and running each statement, and then manually checking whether it worked… I’m talking about tables with anywhere from 1 to 100 million records. You aren’t checking shit by skimming the top 1000. And what a ridiculously error-prone process that is. Take the human out of that equation!

    If the data came out wrong, it would be because the data came in different or corrupted, not because I missed a query. Speaking of differences causing problems… one time a company sent us data that was fixed width by character instead of fixed width by byte. Smh…
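A minimal sketch of why "fixed width by character" versus "fixed width by byte" matters, assuming UTF-8 input. The record layout (a 5-wide name field followed by a 2-wide age field) and the function names are made up for illustration.

```python
# Hypothetical layout: name occupies positions 0-4, age occupies positions 5-6.
NAME_END, AGE_END = 5, 7


def parse_by_chars(line: str):
    """Slice on character offsets."""
    return line[:NAME_END].rstrip(), line[NAME_END:AGE_END]


def parse_by_bytes(raw: bytes):
    """Slice on byte offsets, then decode each field."""
    name = raw[:NAME_END].decode("utf-8", errors="replace").rstrip()
    age = raw[NAME_END:AGE_END].decode("utf-8", errors="replace")
    return name, age


record = "Zoë  42"             # 7 characters, padded by character count
raw = record.encode("utf-8")   # 8 bytes, because "ë" takes 2 bytes in UTF-8

print(parse_by_chars(record))  # fields line up
print(parse_by_bytes(raw))     # age field is shifted by one byte
```

Any row with a multi-byte character shifts every later field when the reader and the writer disagree on the unit, which is exactly the kind of thing a skim of the first thousand rows can miss.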

    • diffusive@lemmy.world
      5 months ago

      This! The point of automation is rarely saving time. The point of automation is increasing quality.

      It can be data quality, it can be mitigating a production risk, it can be avoiding regressions.

      Heck, even unit tests are automation (you could just manually test your code once and call it a day).

      I am not saying that automation is always good, but the evaluation should be:

      1. What is the cost of production/data quality/regression gone wild? (Possibly in €/$/¥)
      2. What is the cost of the person/team performing the task over 1 year? (Again, £/€/$/¥)
      3. What is the expected cost of the person/team implementing the automation?

      Then you compute (1)*3 + (2) - (3)*3. Is it positive? You do it. Is it negative? You don’t. The more positive it is, the higher the priority.

      Why the *3? On (3), because the expected cost of automation is always massively underestimated. On (1), because something has to go wrong multiple times before the decision is reconsidered 🙂

      Why 1 year? Because the task being automated generally changes or disappears.
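The rule of thumb above can be sketched as a small function. This is one reading of the comment, with the signs arranged so that a positive score favors automating (padded benefits minus padded cost). The function name, parameter names, and sample figures are assumptions.

```python
def automation_score(failure_cost, manual_cost_per_year, automation_cost, padding=3):
    """Positive score: automate. Negative score: leave it manual.

    failure_cost         -- (1) cost of a quality/production incident
    manual_cost_per_year -- (2) cost of doing the task by hand for one year
    automation_cost      -- (3) estimated cost of building the automation
    padding              -- the *3 fudge factor applied to (1) and (3)
    """
    benefit = padding * failure_cost + manual_cost_per_year
    cost = padding * automation_cost
    return benefit - cost


# Illustrative figures only, in whatever currency you like.
worth_it = automation_score(failure_cost=20_000, manual_cost_per_year=15_000,
                            automation_cost=10_000)
not_worth_it = automation_score(failure_cost=1_000, manual_cost_per_year=2_000,
                                automation_cost=5_000)

print(worth_it, not_worth_it)
```

Sorting candidate tasks by this score gives a crude priority list: the higher the score, the sooner the task deserves automating.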