Just a quick post today on something interesting about how search engines catalog web pages. Search engines have billions of pages of content to sift through to answer search queries. To save space, catalog more pages, and ultimately return results more quickly, they filter out and ignore what are called “stop words”.
What Are Stop Words
“The”. “Is”. “At”. “One”.
These are a few of the most common examples of stop words. They are common words that are generally not the focus of someone’s search and can be omitted when looking for matches, because they don’t add much, if anything, to the search. In other words, the same search could be run without them and still convey the same message.
If someone does a search for the phrase “the house is in the forest”, the major subjects of that sentence are “house” and “forest” and search engines will place a greater emphasis on them to find the appropriate results.
As another example, take a shorter phrase like “the guitarist of rush”. The search engine likely doesn’t need the words “the” or “of” to find the appropriate results, so it can save time by focusing on “guitarist” and “rush”, and you’ll get the response you were looking for more quickly.
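To make the two examples above concrete, here is a minimal Python sketch of stop word filtering. The stop list and the function name remove_stop_words are made up for illustration; real search engines keep their own lists and do far more than this.

```python
# Hypothetical stop list for illustration only; real engines use their own.
STOP_WORDS = {"the", "is", "at", "one", "of", "in"}

def remove_stop_words(query):
    """Return the words of the query that are not on the stop list."""
    # Lowercase first so "The" and "the" are treated the same way.
    return [word for word in query.lower().split() if word not in STOP_WORDS]

print(remove_stop_words("the house is in the forest"))  # ['house', 'forest']
print(remove_stop_words("the guitarist of rush"))       # ['guitarist', 'rush']
```

Using a set for the stop list keeps each lookup cheap, which matters when the same check runs on every word of every query.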
On some occasions, the search engine will simply replace the words “the” and “of” in that search with a marker such as a * symbol. This saves the search engine a great deal of disk space.
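The marker idea can be sketched the same way. This is only an assumption about how such a replacement might look, not how any particular search engine does it; the function name mask_stop_words and the tiny stop list are hypothetical.

```python
# Hypothetical stop list, limited to the two words from the example.
STOP_WORDS = {"the", "of"}

def mask_stop_words(query, marker="*"):
    """Replace each stop word in the query with a placeholder marker."""
    words = query.lower().split()
    return " ".join(marker if word in STOP_WORDS else word for word in words)

print(mask_stop_words("the guitarist of rush"))  # * guitarist * rush
```

Keeping a marker instead of dropping the word entirely preserves the position of each remaining word, so the phrase keeps its original shape.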
Stop words slow down search engines, and the name reflects how they are handled: much of the time, the search engine does not “stop” to take them into consideration. It’s just one of the tricks search engines use to get their jobs done with maximum efficiency.