[tor-dev] Atlas is not that friendly to Web Archive
Leonid Evdokimov
leon at darkk.net.ru
Tue Feb 13 14:33:29 UTC 2018
Hello!
I've recently found out that the new Atlas redesign is not that friendly to
web archives. http://archive.li/ can't properly detect the "page loaded"
event, which leads to it capturing the "loading" page[%]. Moreover,
https://web.archive.org/ can't capture #-based links at all, as far as I can see.
[%] https://archive.li/https://atlas.torproject.org/%23details/5C3B8FB35A13C508CF65E8499E35755DA098DC93
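A small sketch of why #-based links are awkward for crawlers: the part after "#" is a URL fragment, which browsers keep to themselves and never send in the HTTP request, so a crawler that does not run the page's JS sees only the bare front page. The helper name split_for_crawler below is made up for illustration; the relay URL and the %23 trick are taken from the footnote above.

```python
from urllib.parse import urlsplit

# Atlas details URL from the footnote above, with "#" unescaped.
ATLAS_URL = (
    "https://atlas.torproject.org/"
    "#details/5C3B8FB35A13C508CF65E8499E35755DA098DC93"
)


def split_for_crawler(url):
    """Hypothetical helper: return (URL the server sees, client-only fragment)."""
    parts = urlsplit(url)
    # The fragment is handled entirely in the browser and is not part
    # of the request, so drop it to get what the server is asked for.
    server_side = parts._replace(fragment="").geturl()
    return server_side, parts.fragment


server_side, fragment = split_for_crawler(ATLAS_URL)
print("requested from server:", server_side)
print("kept in the browser:  ", fragment)

# Percent-encoding "#" as "%23" keeps the whole Atlas URL inside the
# archive.li path, which is the form used in the footnote link.
archive_url = "https://archive.li/" + ATLAS_URL.replace("#", "%23")
print("archive.li request:   ", archive_url)
```

So whatever a crawler fetches, the relay fingerprint only matters once the page's JS has run and asked Onionoo for the details.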
The ability to archive Atlas pages is kinda nice, as it lets one "cite" a
relay's status on a specific date: Atlas has no time machine of its own,
and information about a relay is purged a few days after the relay goes down.
https://archive.li/RzGpJ is better than https://archive.li/JGQRW :-)
I'm not a skilled frontend developer, but maybe trading away some Time-to-DOM
by making the JS loading and the onionoo.tpo request synchronous would be
enough to make the website friendly to that sort of crawler... But it's
unclear to me whether T2DOM is a valuable KPI for Atlas or not :)
What do you think?
--
WBRBW, Leonid Evdokimov, xmpp:leon at darkk.net.ru http://darkk.net.ru tel:+79816800702
PGP: 6691 DE6B 4CCD C1C1 76A0 0D4A E1F2 A980 7F50 FAB2