Get a Chance at Google with Aggiorno

by Federico Zoufaly 29. July 2008 05:14

As web users and web developers, we constantly attribute human qualities to the different actors of the Internet. In my mind, Google has always been the sexy, out-of-reach girl that we're constantly trying to impress. When courting a girl, there is a well-defined protocol that needs to be respected. You have to be polite, tactful, respectful (... it seems I am listening to my mom...). In any case, there are rules that need to be followed when you are trying to impress a girl, and these rules go well beyond appearances.

As web developers, we tend to forget some of the rules that need to be followed to make our sites more findable, more accessible, more secure, more maintainable... We typically only care about how our pages look in the common browsers, without paying too much attention to the inner details. You can have great content, but if your markup sucks you will have issues when trying to win over important actors like Google.

Last week I wrote a post on how neglecting web standards can affect your SEO efforts. Lots of small details can really turn Google off.

At Aggiorno, the team is on a death march towards the release of V1.0 (soon... very soon...), and we need to relieve some stress while also educating more people about the importance of good markup and of following web standards in our daily work. We came up with a video called "Get a Chance at Google" that enacts an encounter between Google and a very content-intensive web site with... some issues...

Take a look at the video and share it if you like it.  Also, let us know what you think and if you have more ideas so this can become its own series!

Enjoy!


Accessibility | Aggiorno | Web Standards

Web Standards and Search Engine Optimization (SEO) -- Does Google care about the quality of your markup?

by Federico Zoufaly 21. July 2008 12:44

There are many discussions on the Web regarding the merits of using Web standards. And typical of all discussions, there are two sides.

The argument against using Web standards can be explained with two words: who cares? If the browsers render the code correctly then mission accomplished.

The argument in favor of using Web standards claims that using cleaner code results in improved cross browser compatibility and lower maintenance costs.

I believe there is an important point that should be addressed much more emphatically. What does Google do when it encounters HTML that is not standards compliant? Does it affect your search rankings? We can’t ignore that search engine bots are "users" of our sites, and they are not necessarily as tolerant of our markup as normal browsers. An SEO expert gave me a simple trick for understanding what Google sees and doesn't see on a page. The process goes like this:

  • Open the page you want to evaluate in your favorite browser
  • Click Select All
  • Click Copy
  • Open Notepad
  • Click Paste

Whatever gets pasted into Notepad is what Google is indexing.
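The copy-and-paste trick can be roughly approximated in code. The following is a minimal sketch, assuming Python's standard `html.parser` module; the class and function names are made up for illustration, and this is not a model of how Google's crawler actually works. It keeps the text nodes and drops everything inside `script` and `style`. Unlike a browser copy, it also keeps the `title` text.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0  # > 0 while inside a skipped element

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def visible_text(html):
    """Return the text a text-only reader would get, one chunk per line."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


if __name__ == "__main__":
    page = (
        "<html><head><title>T</title><style>p{color:red}</style></head>"
        "<body><h1>Hello</h1><script>var x=1;</script>"
        '<p>World <img src="a.png"></p></body></html>'
    )
    print(visible_text(page))
```

Note that the image contributes nothing to the output, which is exactly the point of the next section.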

Now that you’re in on that trick, let’s look at some examples of how wrong HTML can affect the results of your search.

Missing Alternate Descriptions

Here’s the first example of how following standards can improve your interaction with Google. Google is "blind." Google only sees the text that is embedded in your page. That means no images, no JavaScript, and no animations. Providing an alternate description for non-textual information (the alt attribute in XHTML) is part of following standards. Not including an alternate description means you are missing an opportunity to provide information to Google. ALT descriptions, as they relate to Web standards, affect search rankings!
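A check for missing alternate descriptions is easy to sketch. The example below assumes Python's standard `html.parser`; the names `MissingAltFinder` and `find_images_missing_alt` are hypothetical. It treats an empty `alt` as missing, which is a simplification, since an empty `alt` is legitimate for purely decorative images.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Record the src of every <img> that has no usable alt text."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <img ... />.
        if tag != "img":
            return
        attributes = dict(attrs)
        alt = attributes.get("alt")
        # Simplification: an empty alt is reported too, even though it is
        # valid for decorative images.
        if alt is None or not alt.strip():
            self.missing.append(attributes.get("src", "(no src)"))


def find_images_missing_alt(html):
    finder = MissingAltFinder()
    finder.feed(html)
    finder.close()
    return finder.missing


if __name__ == "__main__":
    page = '<img src="logo.png"><img src="cat.jpg" alt="A cat">'
    print(find_images_missing_alt(page))
```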

Wrong Or Missing DOCTYPE

DOCTYPE tells the browser what kind of markup to expect. Is it HTML? Is it XHTML? No DOCTYPE? Then Google simply makes a guess, and not necessarily an educated one. The last thing you need is for Google (or any browser, for that matter) to guess how to interpret your source code. Chris Maunder from The Code Project has an excellent example of how Google gets confused if you specify a certain DOCTYPE and then write code that follows a different standard. In certain cases Google simply stops indexing the page and assumes it is a 404 Page Not Found error. The example that Chris shows reflects how a simple mis-closed tag (ultimately a missing "/") can prevent a page from being indexed. Syntax correctness, which is enforced by using Web standards, is important if you want Google to index your page!

UPDATE: In general, this is the tag soup problem. To fix it, make sure your Web site validates against a standard such as XHTML 1.0 Transitional.
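Both problems described above, a missing DOCTYPE and a mis-closed tag, can be spotted with a quick script before running a full validator. This is only a sketch, assuming Python's standard library; the function names are hypothetical. XML well-formedness is a necessary condition for valid XHTML, not a sufficient one, and this check will also reject named entities such as `&nbsp;` because it does not load the DTD.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class DoctypeSniffer(HTMLParser):
    """Remember the first <!DOCTYPE ...> declaration, if any."""

    def __init__(self):
        super().__init__()
        self.doctype = None

    def handle_decl(self, decl):
        if self.doctype is None and decl.lower().startswith("doctype"):
            self.doctype = decl


def declared_doctype(html):
    """Return the DOCTYPE declaration text, or None when the page has none."""
    sniffer = DoctypeSniffer()
    sniffer.feed(html)
    sniffer.close()
    return sniffer.doctype


def is_well_formed_xml(markup):
    """True if the markup parses as XML (every tag closed and nested)."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    print(declared_doctype("<html><body></body></html>"))
    print(is_well_formed_xml("<p>line<br /></p>"))
    print(is_well_formed_xml("<p>line<br></p>"))  # the missing "/" case
```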

Lacking Or Incorrect Use Of Entities

Ahh... entities... Isn't it painful to follow the Web standards rule set and escape every special character? Well it might be painful, but Google reacts to non-escaped characters in very peculiar ways. Let's first look at the most obvious one. If you write in a foreign language that requires characters with an accent, the search results may vary depending on how you code your information with entities.

Second, there are issues with escaped versus unescaped characters in URLs. This WebMasterWorld article is an example of how incorrect use of entities can cause confusion.

Third, when you use scripting to generate markup, the way in which you write your script can also confuse Google as Chris Maunder explains. If you try to generate code without escaping the right characters you can get in trouble. Web standards enforce the proper use of entities. That’s just another reason to follow them and avoid search engine confusion.
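Escaping does not have to be done by hand. The sketch below assumes Python's standard `html.escape` and uses numeric character references for non-ASCII characters; the helper name `to_entities` is made up for illustration. Whether accented characters should be written as entities or sent as properly declared UTF-8 is a separate decision that this example does not settle.

```python
import html


def to_entities(text):
    """Escape markup-significant characters and non-ASCII characters.

    &, <, > and quotes become named or numeric entities via html.escape;
    anything outside ASCII becomes a numeric character reference.
    """
    escaped = html.escape(text, quote=True)
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")


if __name__ == "__main__":
    # Accented text plus an ampersand.
    print(to_entities("café & bar"))
    # A URL with a query string, as it should appear inside an href attribute.
    print(to_entities("page.aspx?a=1&b=2"))
```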

Missing Required Page Elements

There are a number of page elements that are either required or recommended by Web standards and that can affect your page rank. Many SEO experts suggest making sure a page contains at least the following elements:

h1: Every page should have one and only one h1. This tag should be used to express the main idea described on the page. In general, heading tags should not be used only for styling, but also to semantically mark the content in the page. Google pays special attention to h1 content when indexing.

title: Every page should have one and only one title. The title should be related to h1. Google looks at the relationship between h1 and title when indexing.

meta tags: Every page should have a number of meta attributes (description, keywords, etc.). Google takes these keywords into account while indexing and they also provide semantic information about the page. When properly used, these tags improve the user experience.

Web standards are a constant reminder of proper usage and this is one time that being proper is in the best interest of your search rankings.
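The checklist above can be turned into a small audit. This is a minimal sketch, assuming Python's standard `html.parser`; the names `PageAudit` and `audit_page` are hypothetical, and the rules simply mirror the suggestions in this section rather than any published Google criteria.

```python
from html.parser import HTMLParser


class PageAudit(HTMLParser):
    """Count h1 and title elements and collect the names of meta tags."""

    def __init__(self):
        super().__init__()
        self.h1_count = 0
        self.title_count = 0
        self.meta_names = set()

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self.h1_count += 1
        elif tag == "title":
            self.title_count += 1
        elif tag == "meta":
            name = dict(attrs).get("name")
            if name:
                self.meta_names.add(name.lower())


def audit_page(html):
    """Return a list of human-readable problems; empty means all checks pass."""
    audit = PageAudit()
    audit.feed(html)
    audit.close()
    problems = []
    if audit.h1_count != 1:
        problems.append(f"expected exactly one h1, found {audit.h1_count}")
    if audit.title_count != 1:
        problems.append(f"expected exactly one title, found {audit.title_count}")
    if "description" not in audit.meta_names:
        problems.append("missing meta description")
    return problems


if __name__ == "__main__":
    page = "<html><body><h1>One</h1><h1>Two</h1></body></html>"
    for problem in audit_page(page):
        print(problem)
```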

The Separation Of Style and Content

Web standards teach you to separate content from style, which is an incredibly useful practice for improving maintainability. It also clearly has some advantages with respect to Google's behavior. The first one is bandwidth savings. If your styling information is in a separate CSS file, Google will not crawl it, since Google does not care about style, and you will not spend bandwidth serving it. But in addition to bandwidth savings (which can be major for high-traffic sites), there is a limit to the size of a page that search engines will index. So if your page is not "polluted" by styling, it can hold more content! Additionally, this is a way to avoid confusing Google if your style contains syntax errors.

UPDATE: Avoid using HTML tables as a mechanism to lay out the information on a page. Layout should be done with style sheets (CSS).
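To see how much styling is still mixed into a page, you can list the inline `style` attributes and the old presentational tags. The sketch below assumes Python's standard `html.parser`; the names and the short list of presentational tags are illustrative choices, not a complete inventory. It does not try to detect layout tables, since markup alone cannot tell a layout table from a data table.

```python
from html.parser import HTMLParser

# Illustrative subset of purely presentational elements.
PRESENTATIONAL_TAGS = {"font", "center"}


class InlineStyleFinder(HTMLParser):
    """Record each place where styling is embedded in the content."""

    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        if tag in PRESENTATIONAL_TAGS:
            self.findings.append(f"<{tag}> element")
        for name, _value in attrs:
            if name == "style":
                self.findings.append(f"style attribute on <{tag}>")


def find_inline_styling(html):
    finder = InlineStyleFinder()
    finder.feed(html)
    finder.close()
    return finder.findings


if __name__ == "__main__":
    page = '<p style="color:red">Hi <font size="2">there</font></p>'
    for finding in find_inline_styling(page):
        print(finding)
```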

Unmarked Text: No Semantics

Many times Web developers simply copy and paste text into a Web page. The resulting markup is basically text separated with BRs. As of today, I do not believe search engines penalize this behavior. In the future, it will be more important to make sure every piece of text carries as much semantic information as possible. For now, the minimum semantics a piece of text should carry is basic HTML markup like P, UL, Hx, etc. This information can help search engines understand the priority and context of the content. Plus, unmarked text is hard to style and maintain anyway.
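As a small illustration of giving pasted text a minimum of structure, the sketch below wraps each blank-line-separated block in a P element instead of joining lines with BRs. It assumes Python's standard library, and the function name is hypothetical. It only handles paragraphs; deciding what should be a heading or a list still needs a human.

```python
import html
import re


def paragraphs_from_plain_text(text):
    """Wrap each blank-line-separated block of text in <p>...</p>."""
    blocks = re.split(r"\n\s*\n", text.strip())
    paragraphs = []
    for block in blocks:
        if not block.strip():
            continue
        # Collapse internal line breaks and escape special characters.
        content = html.escape(" ".join(block.split()))
        paragraphs.append(f"<p>{content}</p>")
    return "\n".join(paragraphs)


if __name__ == "__main__":
    pasted = "First line\nstill first.\n\nSecond & last."
    print(paragraphs_from_plain_text(pasted))
```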

UPDATE: There are some newer standards like microformats that can add semantic information to a page without affecting how the information is rendered. Even if it is not clear how microformats affect search results now, presumably they will be important in the near future.

Conclusions

Hopefully it is clear that failing to follow Web standards can have a detrimental impact on your search results! Why not provide Google with the best information to index a page? Why risk Google not indexing a page at all because of syntax errors in the markup? Looking good in a browser isn’t enough these days if you want a successful Web site.

It is true that you can avoid most of the mistakes shown here without completely following Web standards, but they are useful as a guideline and as good programming rules. Next time you look at your page, let Aggiorno take over all the time-consuming tasks necessary to make a page XHTML compliant.

UPDATE: Aarron Walter just published a helpful findability strategy checklist that has sections on markup and server and client side code.

Aggiorno promotes Web standards by eliminating a lot of the tedious work that is required to validate a page. That improves the relationship between your site and search engines.

Use Aggiorno to:

  • Find missing alternate descriptions
  • Make your code structure XHTML compliant
  • Convert special characters into appropriate entities
  • Separate content from style
  • Add semantic markup to your text


Aggiorno | Web Standards


Disclaimer

The opinions expressed herein are my own personal opinions and do not represent my employer's views in any way.

Copyright 2008

