

19 days ago
That jumped out at me too. Giving the benefit of the doubt, it could be that this “snapshot” includes a very large amount of data that could be problematic if stored locally for longer. In reality, they probably do it this way for exactly this type of situation, so they can retain full control of the potentially-damning data.
That seems almost maliciously stupid. We need to train a new model. Hey, where’d the data go? Oh well, let’s just go scrape it all again. Wait, did we already scrape this site? No idea, let’s scrape it again just to be sure.