Explained: How Googlebot crawls and indexes web pages


Hello! Google Search Central has recently launched a new series of publications called "Crawling December", where it shares insights about how Googlebot crawls and indexes web pages. Unlike people, who simply view a site in a browser, Googlebot first downloads the HTML from the main URL when it visits a page. That HTML can contain links to JavaScript, CSS, images, and videos. Google's Web Rendering Service (WRS) then uses Googlebot to download these resources and build the final page view.
Modern websites are complex because of their extensive use of JavaScript and CSS, which makes them harder to crawl and render than older pages built exclusively with HTML.
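To make that first step more concrete, here is a minimal sketch of how the sub-resources referenced by a page's HTML could be listed. It only illustrates the idea and is not how Googlebot or WRS is actually implemented; the `ResourceCollector` class, the `list_resources` helper, and the sample HTML and URLs are all hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ResourceCollector(HTMLParser):
    """Collects URLs of sub-resources (scripts, styles, media) found in HTML."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("script", "img", "video", "source") and attrs.get("src"):
            # Resolve relative paths against the page URL.
            self.resources.append(urljoin(self.base_url, attrs["src"]))
        elif tag == "link" and attrs.get("href"):
            rel = (attrs.get("rel") or "").lower().split()
            if "stylesheet" in rel:
                self.resources.append(urljoin(self.base_url, attrs["href"]))


def list_resources(html, base_url):
    """Return the absolute URLs of resources referenced by the given HTML."""
    collector = ResourceCollector(base_url)
    collector.feed(html)
    return collector.resources


# Hypothetical page used only for illustration.
SAMPLE_HTML = """
<html><head>
<link rel="stylesheet" href="/static/site.css">
<script src="/static/app.js"></script>
</head><body><img src="images/logo.png"></body></html>
"""

if __name__ == "__main__":
    for url in list_resources(SAMPLE_HTML, "https://example.com/blog/post"):
        print(url)
```

Every URL in that list is an extra request on top of the HTML itself, which is why resource-heavy pages cost more to crawl.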
🚀 A very important point is managing the "Crawl Budget". Every website has a limited crawl budget, and if Googlebot spends a lot of it downloading additional resources, less remains for crawling the site's main pages. To help preserve the crawl budget, Google uses a caching strategy: the WRS cache lasts up to 30 days and does not depend on the HTTP caching headers set by developers.
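The caching behavior can be pictured with a toy model: a cache whose entries expire after a fixed time and that ignores whatever caching headers the server sent. This is only a sketch under stated assumptions. The `FixedTtlCache` class is hypothetical, and treating "up to 30 days" as a fixed 30-day lifetime is a simplification, not a description of Google's internal implementation.

```python
import time

THIRTY_DAYS = 30 * 24 * 60 * 60  # seconds; assumed fixed lifetime for this toy model


class FixedTtlCache:
    """Toy resource cache whose lifetime ignores HTTP caching headers."""

    def __init__(self, ttl_seconds=THIRTY_DAYS, clock=time.time):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so the behavior can be tested
        self._entries = {}

    def put(self, url, body, http_headers=None):
        # http_headers (e.g. Cache-Control) are accepted but deliberately unused:
        # the entry lifetime depends only on the cache's own TTL.
        self._entries[url] = (self.clock(), body)

    def get(self, url):
        entry = self._entries.get(url)
        if entry is None:
            return None
        stored_at, body = entry
        if self.clock() - stored_at >= self.ttl:
            del self._entries[url]  # expired: the resource must be fetched again
            return None
        return body


if __name__ == "__main__":
    cache = FixedTtlCache()
    cache.put(
        "https://example.com/static/app.js",
        b"console.log('hi')",
        http_headers={"Cache-Control": "max-age=60"},
    )
    # Still served from cache, even though the header asked for 60 seconds.
    print(cache.get("https://example.com/static/app.js") is not None)
```

In this model a resource fetched once is not downloaded again until its entry expires, which is how caching spares crawl budget.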
- 📌 Resources can significantly affect your site's crawl budget, so it is important to understand how Googlebot processes them.
- 📌 Blocking important resources in robots.txt can be risky. If Google cannot access a resource that is required for rendering, this can affect both how the page's content is understood and how the page ranks.
- 📌 Understanding these mechanisms will help SEO specialists and developers make better decisions about where to host resources and how accessible to make them. These choices directly affect how well Google can crawl and index their sites.
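One way to check for the robots.txt risk mentioned above is to test your rendering resources against your rules before deploying them. The sketch below uses Python's standard `urllib.robotparser`; the `blocked_resources` helper, the sample rules, and the URLs are hypothetical. Note also that the standard-library parser does simple prefix matching and may not interpret every rule exactly as Googlebot does (for example, wildcard patterns), so treat the result as a rough check.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks a JavaScript directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /static/js/
"""


def blocked_resources(robots_txt, resource_urls, user_agent="Googlebot"):
    """Return the resource URLs that the given robots.txt rules disallow."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in resource_urls if not parser.can_fetch(user_agent, url)]


if __name__ == "__main__":
    resources = [
        "https://example.com/static/js/app.js",
        "https://example.com/static/css/site.css",
    ]
    for url in blocked_resources(ROBOTS_TXT, resources):
        print("Blocked for rendering:", url)
```

If a script or stylesheet that the page needs for rendering shows up in the blocked list, the rule is worth revisiting.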
Frequently asked questions:
1. What is Googlebot?
Googlebot is Google's web crawler. It crawls new and updated web pages so they can be added to the Google index.
2. What is "Crawl Budget"?
"Crawl Budget" is the number of pages on a site that Googlebot can and wants to crawl over a given period of time.
3. How does robots.txt affect the indexing process?
The robots.txt file tells Googlebot which pages or files on your site it may or may not crawl.
4. What is Google's Web Rendering Service (WRS)?
WRS is the system Google uses to render web pages, much as a browser does.
This article was generated with AI based on the referenced material, then manually edited and reviewed by the author for accuracy and usefulness.
https://www.searchenginejournal.com/google-host-resources-on-different-hostname-to-save-crawl-budget/534317/