Love how you just dive into the details for this Site Audit guide. Excellent stuff! Yours is much much easier to understand than other guides online and I feel like I could integrate this to how I site audit my websites and actually cut down the time I make my reports. I only need to do more research on how to remove “zombie pages”. If you could have a ste-by-step guide to it, that would be awesome! Thanks!
Robots.txt is not an appropriate or effective way of blocking sensitive or confidential material. It only instructs well-behaved crawlers that the pages are not for them, but it does not prevent your server from delivering those pages to a browser that requests them. One reason is that search engines could still reference the URLs you block (showing just the URL, no title or snippet) if there happen to be links to those URLs somewhere on the Internet (like referrer logs). Also, non-compliant or rogue search engines that don't acknowledge the Robots Exclusion Standard could disobey the instructions of your robots.txt. Finally, a curious user could examine the directories or subdirectories in your robots.txt file and guess the URL of the content that you don't want seen.
In the last year, Google and Bing have both indicated a shift to entity-based search results as part of their evolution. Google has unscored this point with rich snippets and Knowledge Graph, and Bing has now upped the ante on personal search results with Bing Snapshots. Find out how you can adopt strategies to stay ahead of the curve in the new world of semantic search results.