This is my first try at creating a map of Lemmy. I based it on the overlap of commenters across communities.

I only used communities on the top 35 active instances for the past month, and limited the comments to go back no further than August 1, 2024 (sometimes not that far back if I got an invalid response).

I scaled it so it was based on the percentage of comments made by a commenter in that community.
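
To make the scaling a bit more concrete, here is a minimal sketch of one way such an overlap could be computed. It is not the code from the repository: the function names (`comment_shares`, `overlap`, `pairwise_overlap`) and the input format (a list of commenter/community pairs, one per comment) are made up for illustration, and it assumes "percentage" means each commenter's share of a community's comments, with the overlap of two communities taken as the sum of the smaller share for every commenter they have in common.

```python
from collections import Counter, defaultdict
from itertools import combinations


def comment_shares(comments):
    """Map each community to {commenter: fraction of that community's comments}.

    `comments` is an iterable of (commenter, community) pairs, one per comment.
    This input format is an assumption made for the example.
    """
    counts = defaultdict(Counter)
    for commenter, community in comments:
        counts[community][commenter] += 1

    shares = {}
    for community, per_user in counts.items():
        total = sum(per_user.values())
        shares[community] = {user: n / total for user, n in per_user.items()}
    return shares


def overlap(shares_a, shares_b):
    """Sum, over commenters present in both communities, of their smaller share."""
    common = shares_a.keys() & shares_b.keys()
    return sum(min(shares_a[user], shares_b[user]) for user in common)


def pairwise_overlap(comments):
    """Return {(community_a, community_b): overlap score} for every pair."""
    shares = comment_shares(comments)
    return {
        (a, b): overlap(shares[a], shares[b])
        for a, b in combinations(sorted(shares), 2)
    }
```

Scoring by share instead of raw counts means one very prolific commenter can't tie two communities together on their own unless they account for a large part of the activity in both.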

Here is the code for the crawler and data that was used to make the map:

https://codeberg.org/danterious/Lemmy_map

  • Danterious@lemmy.dbzer0.com (OP) · 2 months ago

    I had to try scraping the websites multiple times because of stupid bugs I had put in the code, so I might have put more strain on the instances than I meant to. If I did this again, it would hopefully be much less taxing on the servers.

    As for the cost of scraping, it actually isn’t that hard; I just had it running in the background most of the time.

    Anti Commercial-AI license (CC BY-NC-SA 4.0)