March 30th, 2010 by Susie

Google and the other search engines evaluate websites using web crawlers, also called spiders or bots. These are fully automated critters that follow links across the internet independently of their owners and report on what they find. That information is used to measure relevancy to particular searches, and also to rank websites and decide which are the best. There is a slew of other factors at work, of course (SEO is a complex business), but the data gathered by crawlers is extremely important.

Sites with high authority, those that the search engines think are good, are crawled (or, loosely speaking, ‘indexed’) frequently. If they are known to update content often, that could mean a few times a day. New and unknown sites, or those with low authority for whatever reason, won’t be crawled as often. Poorer quality sites may only be indexed once a month or less.

Collecting and storing data with bots is cheap, but when that data runs into terabytes upon terabytes, using it effectively becomes much harder. Of course, there is incredibly valuable information in Google’s data warehouses, but nobody can deal with an infinite amount of data. There are millions and millions of sites on the web, some with a lot of content and many different subpages, so search engines need to prioritise the ones they gather data from.

For the same reasons, crawlers limit the information they gather from each site. As with crawl frequency, the amount of information gathered depends on the standing of the site. The higher a site’s authority, the more the crawler will look at.

For most sites, the bots restrict themselves to the top four levels of the URL. That means a page at thissite.com/level2/level3/level4/apage.html, which sits on a fifth level, won’t be considered. Any keywords or content on it won’t contribute relevancy information. Users don’t like clicking through a lot of levels either, so keeping your site structure to four levels or fewer is a sound idea for usability as well as for search engine optimization.
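If you want to check your own URLs against that rule of thumb, a few lines of Python will do it. This is only a rough sketch: the function names are made up for illustration, and it assumes the counting implied by the example above, where the domain is level 1 and every folder or page after it adds one more level. It says nothing about how any search engine actually counts.

```python
from urllib.parse import urlparse

MAX_LEVELS = 4  # the figure quoted in this post; a rule of thumb, not a published limit


def url_levels(url: str) -> int:
    """Count levels as assumed here: the domain is level 1,
    and each path segment (folder or page) adds one more."""
    # urlparse only recognises the domain when a scheme is present
    if "://" not in url:
        url = "http://" + url
    path = urlparse(url).path
    segments = [part for part in path.split("/") if part]
    return 1 + len(segments)


def within_limit(url: str, max_levels: int = MAX_LEVELS) -> bool:
    """True if the URL sits within the assumed level limit."""
    return url_levels(url) <= max_levels


if __name__ == "__main__":
    for candidate in (
        "thissite.com/level2/level3/apage.html",
        "thissite.com/level2/level3/level4/apage.html",
    ):
        print(candidate, url_levels(candidate), within_limit(candidate))
```

Run it over a list of URLs from your sitemap and anything that comes back False is a candidate for moving higher up the site structure.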

Crawlers also don’t index more than about 150 kB of content from any page or subpage. Images don’t count towards the total, so you do get quite a lot of text within the limit, and all of that will contribute towards your overall SEO efforts. Again, most users won’t read through nearly that much content on any one page either, so there is a second reason to keep each one at a reasonable size. You have to consider your search engine reputation management with every aspect of your site.
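To see roughly where a page stands against that figure, you can measure it yourself. The sketch below uses only Python's standard library. It reports both the size of the raw HTML and the size of the visible text, because the post's figure doesn't say which of the two is meant; the names `page_sizes` and `over_limit`, and the choice to read 150 kB as 150,000 bytes, are assumptions made for illustration.

```python
from html.parser import HTMLParser

LIMIT_BYTES = 150 * 1000  # "about 150 kB" from the post; the exact byte count is an assumption


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of script and style tags."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def page_sizes(html: str) -> dict:
    """Return the UTF-8 size in bytes of the raw HTML and of its visible text."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    text = "".join(parser.chunks)
    return {
        "html_bytes": len(html.encode("utf-8")),
        "text_bytes": len(text.encode("utf-8")),
    }


def over_limit(html: str, limit: int = LIMIT_BYTES) -> bool:
    """True if the raw HTML is larger than the assumed limit."""
    return page_sizes(html)["html_bytes"] > limit
```

Checking the raw HTML is the more cautious of the two measures, since markup and inline scripts can easily outweigh the text a visitor actually reads.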

Titles should be no more than 70 characters in length or thereabouts. That’s not far off the length of the previous sentence, so as you can see, it’s a fairly generous allowance. Anything longer than that will look a little odd anyway.
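Counting characters by eye is tedious, so here is a tiny helper for checking a title against the 70-character guideline. It is a minimal sketch with a made-up function name, and it assumes that runs of whitespace should be collapsed before counting.

```python
from typing import Tuple

TITLE_LIMIT = 70  # the guideline from this post, not a hard technical limit


def check_title(title: str, limit: int = TITLE_LIMIT) -> Tuple[int, bool]:
    """Return the title's length and whether it fits within the limit."""
    # Collapse repeated whitespace so stray spaces don't inflate the count
    normalized = " ".join(title.split())
    return len(normalized), len(normalized) <= limit


if __name__ == "__main__":
    length, fits = check_title("Crawler Limits and What They Mean for Your Site")
    print(length, fits)
```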

There are other factors that limit where bots will look (Flash objects and poor or image-based navigation, for example), but as a rule of thumb, create content for easy reading by people and you probably won’t have to worry much about indexation limits. It is a good policy that will serve your SEO well in a lot of areas.

This entry was posted on Tuesday, March 30th, 2010 at 3:44 pm.
