Online Smart Search

by fzoufaly 12. May 2009 11:32

Have you ever tried to perform a structural search on web pages, that is, a search that takes into account not only the text of a page but also its HTML structure?

Let’s say you want to find out what meta keywords your competitors are using. The first thing you need to do is get the source code of each web page on your competitors’ sites. Next, you will probably write some sort of regular expression to identify the strings you are looking for. Then you have to collect all the occurrences of those strings, put them in a spreadsheet and analyze them. It sounds like a pretty time-consuming activity to me.
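To make the manual route concrete, here is a minimal sketch of it in Python for a single page. It is not part of Online Smart Search; the class and function names are made up for illustration, and it uses the standard library's HTML parser rather than a regular expression. You would still have to fetch every page and merge the results yourself.

```python
from html.parser import HTMLParser


class MetaKeywordsParser(HTMLParser):
    """Collects the keywords listed in every <meta name="keywords"> tag."""

    def __init__(self):
        super().__init__()
        self.keywords = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <meta ... />.
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "keywords":
            content = attributes.get("content") or ""
            # Keywords are conventionally separated by commas.
            self.keywords.extend(
                word.strip() for word in content.split(",") if word.strip()
            )


def extract_meta_keywords(html):
    """Return the meta keywords found in one page of HTML source."""
    parser = MetaKeywordsParser()
    parser.feed(html)
    parser.close()
    return parser.keywords


sample_page = (
    '<html><head><meta name="keywords" content="html, search, seo">'
    "</head><body></body></html>"
)
print(extract_meta_keywords(sample_page))  # ['html', 'search', 'seo']
```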

This is where Online Smart Search comes into play. You just need to input a URL and a search pattern, and we’ll find and tabulate all the occurrences for you.

Online Smart Search will crawl the site starting at the URL (it crawls all pages “below” the starting directory you write in the URL) and apply the search pattern to each page. If a page contains more than one occurrence of the pattern, all of them will be collected. You can analyze the data immediately, or you can export it to an XML file that can be easily read using Excel.
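One way to picture the “below the starting directory” rule is the small Python check that follows. This is only a sketch under an assumption: that “below” means same scheme and host, with a path inside the directory of the starting URL. The function name `is_below` is hypothetical, and the actual crawler may apply different rules.

```python
from urllib.parse import urljoin, urlparse


def is_below(start_url, candidate_url):
    """Return True if candidate_url lies in or under the directory of start_url."""
    start = urlparse(start_url)
    # Resolve relative links against the starting URL first.
    candidate = urlparse(urljoin(start_url, candidate_url))
    if (start.scheme, start.netloc) != (candidate.scheme, candidate.netloc):
        return False
    # The starting directory is everything up to the last slash of the path.
    start_dir = start.path.rsplit("/", 1)[0] + "/"
    return (candidate.path or "/").startswith(start_dir)


start = "http://example.com/products/index.html"
print(is_below(start, "shoes/red.html"))   # True
print(is_below(start, "../about.html"))    # False
```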

The secret sauce is the search pattern language. Online Smart Search uses the same search language as the Smart Search and Replace feature in our desktop line of web optimization tools, and it is much simpler and much more powerful than a regular expression. Please read the full Search Language Specification. Online Smart Search not only parses all of the HTML / XHTML code, it also fixes it structurally before it starts searching. This ensures the best results with minimal effort, even on structurally incorrect pages.

Let’s look at a few search examples:

Find all the meta keywords my competitors are using: <meta name="keywords" content="$KeyWords" />

Find all the meta tags and their value on a page: <meta name="$metaName" content="$content" />

Find all the anchor text for a given page: <a>$anchortext</a>

Find all the anchor text and the corresponding href: <a href="$href"> $anchortext</a>

Find all the links in a page: <a/>

Get all the h1 text for a site: <h1>$h1text</h1>

Get all the text that is marked as bold: <b>$boldtext</b>
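To give a feel for what a pattern with variables captures, here is a rough Python approximation of the anchor pattern with `$href` and `$anchortext`. This is not how Online Smart Search is implemented and it does not use the Smart Search language; the class name and the decision to skip anchors without an href are assumptions made for this sketch. It only shows the kind of name/value pairs you could expect for each occurrence.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Records one {href, anchortext} entry per <a href="..."> element."""

    def __init__(self):
        super().__init__()
        self.matches = []
        self._inside = False
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:  # assumption: anchors without href do not match
                self._inside = True
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._inside:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._inside:
            self.matches.append(
                {"href": self._href, "anchortext": "".join(self._text).strip()}
            )
            self._inside = False


def find_anchors(html):
    collector = AnchorCollector()
    collector.feed(html)
    collector.close()
    return collector.matches


page = '<p><a href="/a.html">First</a> and <a href="/b.html"><b>Second</b> link</a></p>'
for match in find_anchors(page):
    print(match)
```

Each printed dictionary plays the role of one tabulated occurrence, with the variable names as column headings.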

I am sure you’ll be very creative. Let us know which searches you find most useful.

Now, where do I find this new functionality? It is a new option in our Online Web Optimization tool. Remember that we are still in Beta, but I am sure you will find it very useful.

In the following screenshot you can see the input interface. Select the URL, input the Search Pattern and click “Search”.

[Screenshot: the search input interface]

The results will look like the following example.  You just need to navigate the tree structure to analyze the details.

[Screenshot: search results shown as a tree]

If you want to export the results, they will be formatted as XML:

[Screenshot: results exported as XML]

which can be directly imported by Excel where you can add further processing:

[Screenshot: the exported results opened in Excel]
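If you would rather post-process results in a script, the idea of a flat XML export can be sketched in Python as follows. The element names (`results`, `match`, `page`) are hypothetical and are not the actual export format of Online Smart Search; the sketch only illustrates why one element per occurrence, with one child per variable, maps naturally onto spreadsheet rows and columns.

```python
import xml.etree.ElementTree as ET


def results_to_xml(rows):
    """Serialize a list of {column: value} dictionaries as a flat XML string."""
    root = ET.Element("results")
    for row in rows:
        match = ET.SubElement(root, "match")  # one element per occurrence
        for name, value in row.items():
            ET.SubElement(match, name).text = value  # one child per column
    return ET.tostring(root, encoding="unicode")


rows = [
    {"page": "http://example.com/", "KeyWords": "html, search"},
    {"page": "http://example.com/about.html", "KeyWords": "company, team"},
]
print(results_to_xml(rows))
```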

In summary, the Smart Search feature is a Find that understands markup. This feature lets you run searches on HTML code patterns within the page or across multiple pages in a site.  Just provide the URL and the Search Pattern.  You will be surprised by the usefulness of the results.
