Google Search is losing its 'cached' web page feature

d3Xt3r@lemmy.nz · 10 months ago


modus@lemmy.world · 10 months ago

Isn’t caching how anti-paywall sites like 12ft.io work?

megaman@discuss.tchncs.de · 10 months ago

At least some of these tools change their “user agent” to match whatever Google’s crawler sends.

When you browse in, say, Firefox, one of the headers that Firefox sends to the website is “I am using Firefox”, which might affect how the website displays to you, or let the admin know they need Firefox compatibility (or be used to fingerprint you…).

You can just lie on that, though. Some privacy tools will change it to Chrome, since that’s the most common.

Or, you say “i am the google web crawler”, which they let past the paywall so it can be added to google.
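
For anyone curious what that looks like in practice, here is a minimal sketch using only Python's standard library. It builds a request whose User-Agent header claims to be Googlebot, but does not send it. The URL and the `build_request` helper are made up for illustration, and I'm not claiming this is how 12ft.io or any particular tool actually does it.

```python
from urllib.request import Request

# The user-agent string Googlebot is commonly documented as sending.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"


def build_request(url, user_agent):
    """Return a Request object carrying the given User-Agent header.

    Hypothetical helper: nothing is sent over the network here.
    """
    return Request(url, headers={"User-Agent": user_agent})


# example.com is a placeholder URL, not a real paywalled site.
req = build_request("https://example.com/article", GOOGLEBOT_UA)

# urllib stores header names capitalized as "User-agent".
print(req.get_header("User-agent"))
```

The server only sees whatever string is in that header, which is why it is so easy to lie about.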

sfgifz@lemmy.world · edit-2 10 months ago

Or, you say “i am the google web crawler”, which they let past the paywall so it can be added to google.

If I’m not wrong, Google has a set range of IP addresses for their crawlers, so not all sites will let you through just because your UA claims to be Googlebot.
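
A rough sketch of what that server-side check could look like, assuming the site keeps a list of crawler networks. The range below is a documentation placeholder (TEST-NET-1), not a real Google range, and `looks_like_real_crawler` is a name I made up. A real site would load the ranges the crawler operator publishes.

```python
import ipaddress

# Placeholder network for illustration only (TEST-NET-1).
# A real deployment would use the crawler operator's published ranges.
CRAWLER_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def looks_like_real_crawler(user_agent, remote_ip):
    """Accept a Googlebot claim only if the source IP is in a known range."""
    if "Googlebot" not in user_agent:
        return False
    addr = ipaddress.ip_address(remote_ip)
    return any(addr in net for net in CRAWLER_NETWORKS)


ua = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

# Same user agent, different source addresses.
print(looks_like_real_crawler(ua, "192.0.2.10"))    # inside the placeholder range
print(looks_like_real_crawler(ua, "198.51.100.7"))  # outside it
```

With a check like this, a spoofed UA coming from a home connection fails even though the header looks right.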

lud@lemm.ee · 10 months ago

I dunno, but I suspect that they aren’t using Google’s cache if that’s the case.

My guess is that the site uses its own scraper that acts like a search engine, and because websites want to be seen by search engines, they allow them to see everything. This is just my guess, so it might very well be completely wrong.