Formerly u/CanadaPlus101 on Reddit.

Cake day: June 12th, 2023






  • Ah, looks like you beat my edit by a few seconds.

    Good to know about the Netscape thing. It looks like Firefox (being a successor to Netscape) still does it that way, and Chrome can do it that way too. If you’re using a true third option you probably don’t need my help.

    For the sake of completeness, on Tor Browser you have to copy the SQLite database from the browser directory, since it’s too locked down to just export the normal way. Then I’d try just subbing it in on an offline Firefox instance and proceeding the normal way. And obviously, use wget over torsocks as well.
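If you'd rather skip the offline Firefox step, here's a rough sketch of pulling the cookies straight out of the copied database. It assumes the file is a standard Firefox-style `cookies.sqlite` with a `moz_cookies` table, and that `expiry` is stored in seconds (worth checking against your browser version). The function name and file paths are made up for illustration.

```python
import sqlite3
from pathlib import Path


def export_cookies(db_path, out_path):
    """Write cookies from a copied cookies.sqlite to a Netscape-format file.

    Returns the number of cookies written.
    """
    # Open the copy read-only so the database is never modified.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        # Assumption: Firefox-style schema with these column names.
        rows = conn.execute(
            "SELECT host, path, isSecure, expiry, name, value FROM moz_cookies"
        ).fetchall()
    finally:
        conn.close()

    lines = ["# Netscape HTTP Cookie File"]
    for host, path, is_secure, expiry, name, value in rows:
        # A leading dot on the host means the cookie applies to subdomains.
        include_subdomains = "TRUE" if host.startswith(".") else "FALSE"
        secure = "TRUE" if is_secure else "FALSE"
        lines.append(
            "\t".join(
                [host, include_subdomains, path, secure, str(expiry), name, value]
            )
        )

    Path(out_path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(rows)
```

Something like `export_cookies("cookies.sqlite", "cookies.txt")` should then give you a file in the tab-separated layout that wget's `--load-cookies` expects.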


  • I find that the things most likely to disappear (like a tinkerer’s web 1.0 homepage) tend to have limited recursion depth anyway.

    A Tumblr blog takes an awfully long time to crawl politely, IIRC, but the end result wasn’t too big on disk. Now I’m wondering how you would pass a cookie to wget, and how you might set a data cap so you can stop and wait for the month to be up before you call it again. I kind of feel like I’ve done a cookie before to get around a captcha or something…

    Edit: There are a couple of ideas for limiting size on StackOverflow. The wget-specific one is -Q for quota, which you’d want to set conservatively in case there’s one huge file somewhere, since it only checks between individual downloads.

    Looks like there’s a --load-cookies option that will read a browser export of cookies from a file, as well as options to send POST data and save cookies if you want to do something interactive that way.

    Edit edit: What I’m remembering is actually adding headers, like this.
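Putting the options from this comment together, a minimal sketch might look like the following. It only builds the command line rather than running it. The helper name, the example URL, the file name and the header value are all placeholders, and the depth and wait defaults are just guesses at "polite" settings.

```python
import shlex


def build_wget_command(url, cookies_file=None, quota=None, headers=None,
                       depth=2, wait_seconds=2, use_torsocks=False):
    """Return an argv list for a polite, size-capped recursive wget crawl."""
    # Optionally wrap the whole call in torsocks.
    cmd = ["torsocks"] if use_torsocks else []
    cmd += [
        "wget",
        "--recursive",
        f"--level={depth}",        # limit recursion depth
        f"--wait={wait_seconds}",  # pause between requests
        "--random-wait",           # vary the pause a little
    ]
    if quota:
        # Only checked between files, so one huge file can still overshoot.
        cmd.append(f"--quota={quota}")
    if cookies_file:
        # Netscape-format cookies.txt exported from a browser.
        cmd.append(f"--load-cookies={cookies_file}")
    for name, value in (headers or {}).items():
        # Extra request headers, e.g. a Cookie header copied by hand.
        cmd.append(f"--header={name}: {value}")
    cmd.append(url)
    return cmd


if __name__ == "__main__":
    cmd = build_wget_command(
        "https://example.com/",
        cookies_file="cookies.txt",
        quota="500m",
        headers={"Cookie": "session=PLACEHOLDER"},
    )
    print(shlex.join(cmd))
```

You could hand the list to `subprocess.run(cmd)` once the flags look right. Keeping it as a list rather than one string avoids quoting trouble with the header value.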







  • CanadaPlus@lemmy.sdf.org to Programmer Humor@lemmy.ml, on “Designers cry quietly”
    4 days ago

    Uhh, so looking carefully at the picture, it appears they shouldn’t have bothered with the inner pathway at all, and should have just connected the bridge over the canal (?) in the background to whatever is under the camera.

    Not only does the current design fail to provide the short path people actually want, it leaves a goofy little boulevard behind the benches in what appears to be a dense, desirable urban area where you shouldn’t waste space.