Crawling and indexing lay the groundwork for search engines to rank results. Although both are fundamental to how search engines operate, they are often overlooked or misunderstood. During the crawling and indexing session of Live with Search Engine Land, Martin Splitt, search developer advocate at Google, explained the two processes using a simple analogy about librarians.
“Imagine a librarian: If you are writing a new book, the librarian has to actually take the book and figure out what the book is about and also what it relates to, if there’s other books that might be source material for this book or might be referenced from this book,” Splitt said. In his example, the librarian is Google’s web crawler (referred to as Googlebot) and the book is a website or webpage.
“Then you . . . have to read through [the book], you have to understand what it is about, you have to understand how it relates to the other books, and then you can sort it into the catalog,” he said, explaining the indexing process. The content of your webpage is then stored in the “catalog” (i.e., the search engine’s index), where it can be ranked and served as a result for relevant queries.
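To make the analogy concrete, here is a minimal sketch of the two steps in Python. It is a toy model, not how Googlebot or Google's index actually work: the `TOY_WEB` dictionary, the example.com URLs, and the `crawl`, `build_index`, and `search` functions are all hypothetical names invented for illustration, and no network requests are made. Crawling is modeled as following links from a starting page, and indexing as sorting each discovered page's words into a lookup "catalog."

```python
from collections import deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
TOY_WEB = {
    "https://example.com/": (
        "welcome to our library guide",
        ["https://example.com/crawling", "https://example.com/indexing"],
    ),
    "https://example.com/crawling": (
        "crawling means discovering pages by following links",
        ["https://example.com/indexing"],
    ),
    "https://example.com/indexing": (
        "indexing means reading pages and sorting them into a catalog",
        ["https://example.com/"],
    ),
    # Nothing links to this page, so a link-following crawler never sees it.
    "https://example.com/orphan": (
        "no page links here so the crawler never finds it",
        [],
    ),
}


def crawl(seed, web):
    """Crawling: start at a seed URL and discover pages by following links."""
    seen = {seed}
    queue = deque([seed])
    pages = {}
    while queue:
        url = queue.popleft()
        if url not in web:
            continue  # unreachable or missing page
        text, links = web[url]
        pages[url] = text
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return pages


def build_index(pages):
    """Indexing: map each word to the set of URLs whose text contains it."""
    index = {}
    for url, text in pages.items():
        for word in set(text.lower().split()):
            index.setdefault(word, set()).add(url)
    return index


def search(index, query):
    """Return the URLs that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


pages = crawl("https://example.com/", TOY_WEB)
index = build_index(pages)
print(sorted(search(index, "catalog")))
```

In this sketch the orphan page never enters the index because no crawled page links to it, which mirrors the point of the analogy: a book the librarian never picks up cannot be sorted into the catalog. Real search engines also discover pages through other means, such as sitemaps, which this toy model leaves out.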
Why we care. For content to be eligible to appear in search results, it must first be crawled and indexed. Understanding how crawling and indexing work can help you resolve technical SEO issues and ensure your pages are accessible to search engines.