Recently Google changed the way they view sites. In the past, search engines tended to see a website much as a user with a text-only browser would. This has since changed: search engines now look at sites the way a human using a modern web browser does, rich with imagery, video content and a variety of other media.
This can cause problems if a site's robots.txt accidentally blocks JavaScript, CSS and images from being crawled. It can happen when a CMS (content management system) ships with a default robots.txt that hides the key files required to display a page from Google's crawler's gaze. On this point, Google have gone on record as saying:
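As an illustration, a CMS default along these lines (the directory names here are hypothetical, modelled on a typical CMS layout) would hide a page's assets from the crawler:

```
User-agent: *
# These two lines block the very files Google needs to render the page
Disallow: /assets/css/
Disallow: /assets/js/
# Blocking an admin area, by contrast, is usually harmless
Disallow: /admin/
```

Removing the first two Disallow lines, or adding explicit Allow rules for the CSS and JavaScript paths, lets Google render the page as a visitor would see it.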
“Disallowing crawling of JavaScript or CSS files in your site’s robots.txt directly harms how well our algorithms render and index your content and can result in suboptimal rankings.”
This is a major change and indicates that how a site looks to a human user now factors into rankings. After all, Google want to provide the end user with the best possible sites, and in this day and age a text-based site simply won't do. We have come to expect imagery, video and a host of other points of interaction. A robots.txt file is a powerful tool, but the downside is that even a minor error in it can cause major disruption. We have seen sites block so much of their JavaScript, images and CSS that they appear to Google as text-only websites. In the past we have even seen sites accidentally block themselves entirely, which had a massive impact on their traffic.
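One way to catch this kind of over-blocking before it does damage is to test the rules programmatically. Below is a minimal sketch using Python's standard-library `urllib.robotparser`; the rules and URLs are invented for illustration, not taken from any real site:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules, modelled on a CMS default that blocks asset folders
rules = """\
User-agent: *
Disallow: /assets/css/
Disallow: /assets/js/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# The script a page needs in order to render is refused to Googlebot...
print(parser.can_fetch("Googlebot", "https://example.com/assets/js/app.js"))  # False

# ...while ordinary pages remain crawlable
print(parser.can_fetch("Googlebot", "https://example.com/index.html"))  # True
```

Running a handful of `can_fetch` checks like this against your real robots.txt, one per asset type, makes an easy pre-launch smoke test.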
Google now provide tools as part of Google Webmaster Tools that let you see the results of a Google crawl and whether any elements on a page are blocked. You or your developers should never use a default robots.txt for a particular CMS without checking each line to ensure that it is required, or whether additional lines need to be added.

Make sure that every site has a robots.txt, even if it is empty. When Google's crawler bots can't find a robots.txt file, they can sometimes assume that the entire site no longer exists, and you could take a substantial rankings hit and ultimately lose traffic.

Another problem often encountered with robots.txt files comes during the launch of a new site. Developers put a robots.txt in place to block all crawlers during the development phase, so that a half-built site cannot rank. Forgetting to remove it at launch is a problem I see more often than I would like to.

If you would like any further clarification on this issue, do not hesitate to get in contact with a member of the team. We'd be happy to help.
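For reference, the development-phase block-everything file mentioned above is only two lines long, which is exactly why it is so easy to overlook at launch:

```
# Refuses every crawler access to the whole site.
# Fine on a staging server; left in production, it can wipe a site from the index.
User-agent: *
Disallow: /
```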