Metasearch Engine and Search Aggregator

Search Aggregator

A search aggregator is a type of metasearch engine that gathers results from multiple search engines simultaneously, typically through RSS search results. It combines user-specified search feeds (parameterized RSS feeds that return search results) to give the user the same level of control over content as a general aggregator.
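As a minimal sketch of the idea, the Python below fills a query into parameterized feed URLs and merges the items of several RSS documents. The feed templates, the `{query}` placeholder and the sample feeds are hypothetical and exist only for illustration; a real aggregator would fetch each URL over the network rather than parse inline strings.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote_plus

# Hypothetical parameterized search feeds; "{query}" marks the search term.
FEED_TEMPLATES = [
    "https://news.example.com/search.rss?q={query}",
    "https://blogs.example.org/feed?search={query}",
]

def build_feed_urls(query, templates=FEED_TEMPLATES):
    """Fill the user's query into each parameterized feed URL."""
    encoded = quote_plus(query)
    return [template.format(query=encoded) for template in templates]

def parse_rss_items(rss_text):
    """Extract (title, link) pairs from an RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    return [
        (item.findtext("title", default=""), item.findtext("link", default=""))
        for item in root.iter("item")
    ]

def aggregate(feed_documents):
    """Merge items from several feeds, keeping the first copy of each link."""
    seen, merged = set(), []
    for rss_text in feed_documents:
        for title, link in parse_rss_items(rss_text):
            if link not in seen:
                seen.add(link)
                merged.append((title, link))
    return merged

# Inline sample feeds standing in for fetched search results.
SAMPLE_A = (
    '<rss version="2.0"><channel><title>A</title>'
    "<item><title>First</title><link>https://a.example/1</link></item>"
    "<item><title>Shared</title><link>https://shared.example/x</link></item>"
    "</channel></rss>"
)
SAMPLE_B = (
    '<rss version="2.0"><channel><title>B</title>'
    "<item><title>Shared</title><link>https://shared.example/x</link></item>"
    "<item><title>Second</title><link>https://b.example/2</link></item>"
    "</channel></rss>"
)

if __name__ == "__main__":
    print(build_feed_urls("rss search"))
    for title, link in aggregate([SAMPLE_A, SAMPLE_B]):
        print(title, link)
```

The user controls the result set simply by editing the list of feed templates, which is the kind of flexibility described above.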

Soon after the introduction of RSS, sites began publicising their search results in parameterized RSS feeds. Search aggregators are an increasingly popular way to take advantage of the power of multiple search engines with a flexibility not seen in traditional metasearch engines. To the end user, a search aggregator may appear to be just a customizable search engine, and the use of RSS may be completely hidden. However, RSS is directly responsible for the existence of search aggregators and is a critical component of the behind-the-scenes technology.

Metasearch Engine

A metasearch engine (or aggregator) is a search tool that uses other search engines' data to produce its own results from the Internet. Metasearch engines take input from a user and simultaneously send out queries to third-party search engines. Once sufficient data has been gathered, the results are ranked and presented to the user.
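The query fan-out described above might be sketched as follows. The two engine functions are stand-ins that return fixed lists; they are assumptions for illustration, not real search APIs, and a thread pool is only one of several ways to issue the queries at the same time.

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-ins for third-party engines: each returns a ranked list of URLs.
def engine_a(query):
    return ["https://a.example/1", "https://shared.example/x"]

def engine_b(query):
    return ["https://shared.example/x", "https://b.example/2"]

ENGINES = {"engine_a": engine_a, "engine_b": engine_b}

def metasearch(query, engines=ENGINES):
    """Send the query to every engine in parallel and collect the ranked lists."""
    with ThreadPoolExecutor(max_workers=len(engines)) as pool:
        futures = {name: pool.submit(fn, query) for name, fn in engines.items()}
        # result() blocks until the corresponding engine has answered.
        return {name: future.result() for name, future in futures.items()}

if __name__ == "__main__":
    for name, urls in metasearch("tax forms").items():
        print(name, urls)
```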

However, metasearch also has issues. The scores that different search engines assign to websites are not directly comparable, and this can draw in irrelevant documents. Other problems such as spamming also significantly reduce the accuracy of the search. The process of fusion aims to tackle this issue and improve the engineering of a metasearch engine.
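The text does not name a particular fusion method. As one illustration, the sketch below uses reciprocal rank fusion, a well-known technique that combines result lists using only each document's rank and so avoids comparing the engines' raw scores. The constant `k=60` is a commonly used default, and the input lists are made up.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists: each list adds 1 / (k + rank) per document."""
    scores = defaultdict(float)
    for ranked_urls in rankings:
        for rank, url in enumerate(ranked_urls, start=1):
            scores[url] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken alphabetically for stability.
    return sorted(scores, key=lambda url: (-scores[url], url))

if __name__ == "__main__":
    engine_a = ["https://a.example/1", "https://shared.example/x"]
    engine_b = ["https://shared.example/x", "https://b.example/2"]
    for url in reciprocal_rank_fusion([engine_a, engine_b]):
        print(url)
```

A document returned by both engines accumulates two contributions, so it rises above documents that only one engine returned, even when neither engine ranked it first.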

There are many types of metasearch engines available to allow users to access specialised information in a particular field. These include SavvySearch and MetaSEEK.

Advanced query operators – filetype: & ext: – understanding the differences

Bing offers various advanced query operators that help customers and Bing API customers refine their queries to match their needs. Two of these operators – filetype: and ext: – appear to be the same, but there are subtle differences. Let’s review each to better understand them.


One of the most commonly used operators is filetype: which enables you to filter documents based on their particular filetype. Usually this operator is used to filter search results to html, txt, and pdf, as well as the primary Office document types: DOC, RTF, XLS, and PPT for Word, Excel, and PowerPoint documents. This is useful for finding official forms which are usually in PDF or DOC format… for example, 1040 filetype:pdf for the official IRS 1040 for US Taxes.


ext: is used to return only webpages with the specified file name extension. This is also helpful for finding URLs ending in specific formats… for example, template ext:docx will filter search results to URLs with the extension .docx, one of the new Microsoft Office Open XML formats introduced with Microsoft Office 2007.

So what’s the difference between filetype and ext?

The key difference between these two operators is as follows. The Bing filetype: operator is based on our classification of the content associated with URLs, whereas the Bing ext: operator is based only on the URL file extension. You should note that filetype: covers the high-level type, not the individual version. In other words, filetype: with DOC, DOCX or WORD will all produce the same result (filtering to MSDOC documents).

Internet URLs can be ambiguous. For instance, a URL can end with .pdf without being an Adobe Portable Document Format (PDF) file, and the opposite is also true: a URL can end with .html and actually be a PDF file.
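The mismatch can be illustrated with a small sketch that compares a URL's extension with a rough guess at what the content really is. Checking the leading bytes of the response body is only a stand-in chosen for illustration; how Bing actually classifies content is not described here, and the URLs and bodies below are invented.

```python
import posixpath
from urllib.parse import urlparse

def url_extension(url):
    """Return the lowercase extension of the URL path, without the dot."""
    path = urlparse(url).path
    return posixpath.splitext(path)[1].lstrip(".").lower()

def sniff_content_type(body):
    """Very rough classification of a response body by its leading bytes."""
    head = body.lstrip()[:64].lower()
    if head.startswith(b"%pdf-"):
        return "pdf"
    if head.startswith(b"<!doctype html") or head.startswith(b"<html"):
        return "html"
    return "unknown"

if __name__ == "__main__":
    # Invented examples: the extension and the real content disagree.
    samples = [
        ("https://example.com/report.html", b"%PDF-1.7\n..."),
        ("https://example.com/download.pdf", b"<!DOCTYPE html><html></html>"),
    ]
    for url, body in samples:
        print(url, "ext =", url_extension(url), "content =", sniff_content_type(body))
```

In this sketch, an ext:-style filter would look only at the first value, while a filetype:-style filter would look at the second.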

Because the two operators test different things, they can be combined, which gives Bing customers further ability to refine their queries to match their needs. For instance, the query template filetype:doc ext:docx combines filetype: and ext: and will filter search results to Microsoft Word documents in the docx Office Open XML format.
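For anyone assembling such queries in code, a minimal helper might look like the following. The function name and parameters are hypothetical; the sketch only builds the query text and URL-encodes it, and makes no assumption about any particular Bing API.

```python
from urllib.parse import quote_plus

def build_query(terms, filetype=None, ext=None):
    """Append optional filetype: and ext: operators to the search terms."""
    parts = [terms]
    if filetype:
        parts.append(f"filetype:{filetype}")
    if ext:
        parts.append(f"ext:{ext}")
    return " ".join(parts)

if __name__ == "__main__":
    query = build_query("template", filetype="doc", ext="docx")
    print(query)              # the query as a user would type it
    print(quote_plus(query))  # the same query encoded for use in a URL
```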

Webmasters can also use these operators to audit the files indexed within their site, for example via filetype:pdf.


Bing Blogs