Google Now Uses RSS/Atom Feeds to Discover New URLs

November 2, 2009 by Allan
Filed under: Google 

Google Webmaster Central has announced on its blog that Google now uses RSS/Atom feeds to discover new URLs. In short, Google now finds pages not only by following links in website and blog content but also through the RSS/Atom feeds those sites syndicate. Whereas Google previously relied mainly on the links found on websites and blogs, your site can now also be found through RSS/Atom feeds, including feeds that are picked up by online feed readers such as Google Reader.

For those who don’t yet know what RSS/Atom is:

RSS (most commonly expanded as “Really Simple Syndication” but sometimes “Rich Site Summary”) is a family of web feed formats used to publish frequently updated works (such as blog entries, news headlines, audio, and video) in a standardized format. An RSS document (which is called a “feed”, “web feed”, or “channel”) includes full or summarized text, plus metadata such as publishing dates and authorship.

Web feeds benefit publishers by letting them syndicate content automatically. They benefit readers who want to subscribe to timely updates from favored websites or to aggregate feeds from many sites into one place. RSS feeds can be read using software called an “RSS reader”, “feed reader”, or “aggregator”, which can be web-based, desktop-based, or mobile-device-based. A standardized XML file format allows the information to be published once and viewed by many different programs.
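To make the “published once, viewed by many programs” idea concrete, here is a minimal sketch of how any program could read the links and publishing dates out of a feed. The feed below is made up for illustration (the site name, titles, and URLs are placeholders), and it uses only Python’s standard XML parser; real feeds often carry extra elements and namespaces that this sketch ignores.

```python
import xml.etree.ElementTree as ET

# A made-up RSS 2.0 feed; the site, titles, and URLs are placeholders.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <link>http://www.example.com/</link>
    <description>A hypothetical feed used for illustration</description>
    <item>
      <title>First post</title>
      <link>http://www.example.com/first-post</link>
      <pubDate>Mon, 02 Nov 2009 09:00:00 GMT</pubDate>
      <description>Summary of the first post.</description>
    </item>
    <item>
      <title>Second post</title>
      <link>http://www.example.com/second-post</link>
      <pubDate>Tue, 03 Nov 2009 09:00:00 GMT</pubDate>
      <description>Summary of the second post.</description>
    </item>
  </channel>
</rss>"""


def extract_items(feed_xml):
    """Return one dict per <item> with its title, link, and publishing date."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title"),
            "link": item.findtext("link"),
            "published": item.findtext("pubDate"),
        })
    return items


if __name__ == "__main__":
    for entry in extract_items(SAMPLE_FEED):
        print(entry["published"], entry["title"], entry["link"])
```

Each `<item>` carries a `<link>` to the page itself, which is exactly the kind of URL a crawler can pick up from a feed.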

The user subscribes to a feed by entering into the reader the feed’s URI or by clicking an RSS icon in a web browser that initiates the subscription process. The RSS reader checks the user’s subscribed feeds regularly for new work, downloads any updates that it finds, and provides a user interface to monitor and read the feeds.
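The “check regularly for new work” step can be sketched roughly like this. It is only an illustration of the idea, not how any particular reader (or Google) actually does it: the function names `build_feed` and `new_links` are made up here, and the feed is generated in memory rather than downloaded so the example stays self-contained.

```python
import xml.etree.ElementTree as ET


def build_feed(links):
    """Build a tiny RSS 2.0 document with one <item> per link (demo helper)."""
    items = "".join(
        "<item><title>Post</title><link>%s</link></item>" % link
        for link in links
    )
    return (
        '<rss version="2.0"><channel><title>Example</title>%s</channel></rss>'
        % items
    )


def new_links(feed_xml, seen):
    """Return the links not seen on earlier checks, and remember them."""
    root = ET.fromstring(feed_xml)
    fresh = []
    for item in root.iter("item"):
        link = item.findtext("link")
        if link and link not in seen:
            seen.add(link)
            fresh.append(link)
    return fresh


if __name__ == "__main__":
    seen = set()
    # First check: everything in the feed is new.
    print(new_links(build_feed(["http://www.example.com/a"]), seen))
    # Later check: only the entry added since the last check is reported.
    print(new_links(build_feed(["http://www.example.com/a",
                                "http://www.example.com/b"]), seen))
```

A real reader would download the feed from its URI on a schedule and keep the `seen` set between runs; the comparison step is the same.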

So for Google to discover your webpages through RSS/Atom, it is important that your robots.txt file does not block Googlebot from crawling your feeds.

To find out if Googlebot can crawl your feeds and find your pages as fast as possible, check your Webmaster Tools account. The Test robots.txt tool will show you if your robots.txt file is blocking Googlebot from a file or directory on your site. The tool can be found in Crawler Access under Site Configuration.
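If you want a quick local check before opening Webmaster Tools, a rough sketch with Python’s standard `urllib.robotparser` module looks like this. The two robots.txt files and the `/feed/` URL are hypothetical examples, and Python’s parser only approximates how Googlebot reads robots.txt, so treat the Webmaster Tools result as the one that counts.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks the feed directory for all crawlers.
BLOCKING_ROBOTS = """User-agent: *
Disallow: /feed/
"""

# Hypothetical robots.txt that leaves the feed directory crawlable.
PERMISSIVE_ROBOTS = """User-agent: *
Disallow: /wp-admin/
"""


def can_googlebot_fetch(robots_txt, url):
    """Return True if these robots.txt rules let the Googlebot user agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("Googlebot", url)


if __name__ == "__main__":
    feed_url = "http://www.example.com/feed/"  # placeholder feed address
    print("blocking rules:", can_googlebot_fetch(BLOCKING_ROBOTS, feed_url))
    print("permissive rules:", can_googlebot_fetch(PERMISSIVE_ROBOTS, feed_url))
```

If the check reports that your feed URL is blocked, look for a `Disallow` rule covering the feed’s path.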
