Have you ever tried to perform a structural search on web pages? That is, a search that takes into account not only the text of a page but also its HTML structure.
Let’s say you want to find out what meta keywords your competitors are using. The first thing you need to do is get access to the source code of each web page on your competitor’s site. Next, you will probably write some sort of regular expression to identify the strings you are looking for. Then you have to collect all the occurrences of those strings, put them in a spreadsheet, and analyze them. It sounds like a pretty time-consuming activity to me.
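To make the manual approach concrete, here is a minimal Python sketch of the regular-expression step. It is only an illustration: the pattern, the `extract_keywords` helper and the sample markup are assumptions made for this example, and the regex deliberately handles just one attribute order, which is exactly why this route gets tedious.

```python
import re

# Matches <meta name="keywords" content="..."> with the attributes in that order.
# Assumption for this sketch: real pages may swap the attribute order or add
# other attributes, and this simple pattern will then miss them.
META_KEYWORDS = re.compile(
    r'<meta\s+name=["\']keywords["\']\s+content=["\']([^"\']*)["\']',
    re.IGNORECASE,
)

def extract_keywords(html):
    """Return the individual keywords from every matching meta tag."""
    keywords = []
    for content in META_KEYWORDS.findall(html):
        keywords.extend(k.strip() for k in content.split(",") if k.strip())
    return keywords

sample = '<head><meta name="keywords" content="seo, web optimization, search" /></head>'
print(extract_keywords(sample))  # ['seo', 'web optimization', 'search']

# Same information, different attribute order: the regex finds nothing.
print(extract_keywords('<meta content="seo" name="keywords">'))  # []
```

You would still have to download every page, run this on each one, and merge the results by hand.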
This is where Online Smart Search comes into play. You just need to enter a URL and a search pattern, and we’ll find and tabulate all the occurrences for you.
Online Smart Search will crawl the site starting at the URL (it crawls all pages “below” the starting directory you write in the URL) and apply the search pattern to each page. If a page contains more than one occurrence of the pattern, all of them will be collected. You can analyze the data immediately or export it to an XML file that can be easily read using Excel.
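As a rough illustration of what “below the starting directory” could mean, here is a small Python sketch. It is not how Online Smart Search is implemented; the `start_directory` and `is_below` helpers, and the rule that a page qualifies when it is on the same host and its path starts with the starting URL’s directory, are assumptions made for this example.

```python
from urllib.parse import urlparse
import posixpath

def start_directory(start_url):
    """Return the directory part of the starting URL's path, ending in '/'."""
    path = urlparse(start_url).path or "/"
    if path.endswith("/"):
        return path
    # Drop the file name: '/blog/index.html' becomes '/blog/'.
    return posixpath.dirname(path).rstrip("/") + "/"

def is_below(start_url, candidate_url):
    """True if candidate_url is on the same host and under the start directory."""
    start, cand = urlparse(start_url), urlparse(candidate_url)
    if (start.scheme, start.netloc) != (cand.scheme, cand.netloc):
        return False
    return (cand.path or "/").startswith(start_directory(start_url))

start = "http://example.com/blog/index.html"
print(is_below(start, "http://example.com/blog/2008/post.html"))  # True
print(is_below(start, "http://example.com/about.html"))           # False
```

Under this assumption, starting at the site root covers the whole site, while starting inside a subdirectory limits the crawl to that section.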
The secret sauce is in the search pattern language. Online Smart Search uses the same search language as the Smart Search and Replace feature in our desktop line of web optimization tools, and it is way simpler and way more powerful than a regular expression. Please read the full Search Language Specification. Online Smart Search not only parses all of the HTML / XHTML code, it also fixes it structurally before it starts searching. This ensures the best results with minimal effort, even on structurally incorrect pages.
Let’s look at a few search examples:
Find all the meta keywords my competitors are using: <meta name="keywords" content="$KeyWords" />
Find all the meta tags and their value on a page: <meta name="$metaName" content="$content" />
Find all the anchor text for a given page: <a>$anchortext</a>
Find all the anchor text and the corresponding href: <a href="$href"> $anchortext</a>
Find all the links in a page: <a/>
Get all the h1 text for a site: <h1>$h1text</h1>
Get all the text that is marked as bold: <b>$boldtext</b>
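To give a feel for what a markup-aware search does differently from a regular expression, here is a short Python sketch that collects roughly what the anchor text and href pattern asks for, using the standard library’s `html.parser`. It is an approximation for illustration only, not the Smart Search engine: the `AnchorCollector` class is hypothetical, and it assumes that only anchors carrying an href should be reported.

```python
from html.parser import HTMLParser

class AnchorCollector(HTMLParser):
    """Collect (href, anchor text) pairs, similar to <a href="$href">$anchortext</a>."""

    def __init__(self):
        super().__init__()
        self.results = []
        self._href = None
        self._text = []
        self._in_anchor = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            # Assumption: anchors without an href do not match the pattern.
            if href is not None:
                self._in_anchor = True
                self._href = href
                self._text = []

    def handle_data(self, data):
        # Text inside nested tags such as <b> is still part of the anchor text.
        if self._in_anchor:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_anchor:
            self.results.append((self._href, "".join(self._text).strip()))
            self._in_anchor = False

page = '<p>See <a href="/docs">the <b>docs</b></a> or <a href="/faq">FAQ</a>.</p>'
collector = AnchorCollector()
collector.feed(page)
collector.close()
print(collector.results)  # [('/docs', 'the docs'), ('/faq', 'FAQ')]
```

Because the parser works on tags and attributes rather than raw characters, attribute order, quoting style and nested markup do not get in the way.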
I am sure you’ll be very creative. Let us know which searches you find most useful.
Now, where do I find this new functionality? It is a new option in our Online Web Optimization tool. Remember, we are still in beta, but I am sure you will find it very useful.
In the following screenshots you can see the input interface. Select the URL, enter the Search Pattern and click “Search”.
The results will look like the following example. You just need to navigate the tree structure to analyze the details.
If you want to export the results, they will be formatted as XML:
which can be directly imported by Excel where you can add further processing:
In summary, the Smart Search feature is a Find that understands markup. This feature lets you run searches on HTML code patterns within the page or across multiple pages in a site. Just provide the URL and the Search Pattern. You will be surprised by the usefulness of the results.