Article View: rocksolid.shared.hacking
Article #553
Re: linklists to onion sites
From: Anonymous@news.n
Date: Tue, 14 Mar 2023 15:44
8 lines
890 bytes
alright, the spider has now crawled more than 10k different onion domains. the jump from 5k to 10k happened really fast, right after i deleted the ~4.8M links to cp sites from the database and blocked the cluster of (huge) identical sites under the addresses of http://invest******************.onion

i guess the index is almost usable now. the only thing left to add would be some kind of manual review option (the keyword-based classification works in most cases, but not always).

the crawling speed is still surprisingly slow compared to my expectations. i assumed that the vast number of onion services estimated to be operating already in 2021 would make this exercise faster (the tor team estimated the number of onion domains to be ~110k in 2021, using some kind of statistical analysis). but maybe a lot of them are not linked anywhere.

--
Posted on Rocksolid Light
Message-ID:
<41b5a689b6f79647d58c468d57858676@news.novabbs.org>
Path:
rocksolid-us.pugleaf.net!archive.newsdeef.eu!archive!apf9.newsdeef.eu!i2pn2.org!.POSTED.novabbs-org!not-for-mail
References:
<f5511e88d3e257b0af3ba43f0075fffe@rocksolidbbs.com> <ttdtda$2li5k$1@dont-email.me> <621efb6656165bdcb3ba10de8d39cb45@news.novabbs.org> <ttf2bd$2rteb$1@dont-email.me> <34d675e4eba6ffe7fce3ce775dc824c0@news.novabbs.org> <14c0a8eb0f4c6641d7cbe8adedc1c93d@rocksolidbbs.com>