Have a website? Want it to get found in search engines? If the answer to these questions is “Yes”, then it’s likely you’ve given search engine optimization (SEO) a go at some point, or at least more than a passing thought. While the impulse when getting started with SEO is often to dive headfirst into keyword research and link building, there are some less talked-about aspects of SEO that are just as important. Namely, two crucial files you should have on your site: robots.txt and sitemap.xml.
But what do these files do, and why are they so important? Read on to find out what they are, what they do, and how to add both to your site.
Before discussing robots.txt and sitemap.xml, you’ll first need to understand two relevant terms: website indexing and web crawling.
Web indexing is how search engines store and organize information about web pages across the World Wide Web. Essentially, indexing is the entire point of search engines! Where a website ranks in the results drawn from that index depends on a whole slew of SEO factors, from the aforementioned keywords to relevance and content quality.
Crawling is how search engines find pages across the web to index. Basically, each search engine has bots known as crawlers that “crawl” the web, seeking out new content or web pages to save to its index by following the links featured on each page they find.
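To make the idea of "following links" a little more concrete, here is a minimal sketch of the link-discovery step in Python. It is not how any particular search engine's crawler works; the class name, the sample HTML, and the example.com domain are all made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the absolute URL of every <a href="..."> found on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return the links a crawler could follow from this page."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


# Hypothetical page content; example.com is a placeholder domain.
sample_html = '<a href="/about">About</a> <a href="https://example.com/blog/">Blog</a>'
found_links = extract_links(sample_html, "https://example.com/")
```

A real crawler would repeat this step for each newly found link, which is why pages that nothing links to can be hard for it to discover.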
Now that you’re an expert on all things indexing and crawling, let’s move on to what you’re here for.
What are robots.txt & sitemap.xml?
Robots.txt and sitemap.xml are essential files that can help search engines better understand your particular website and index it correctly. For this reason, robots.txt and XML sitemaps go hand in hand.
The importance of XML sitemaps
An XML sitemap is a blueprint of what you consider the most important parts of your website. While the name “sitemap” might suggest an illustrated layout of your site, it’s actually just a list of page links. Although web crawlers should be able to find the pages on your site well enough if they are properly linked (both internally and externally), an XML sitemap points them toward the content you consider most pertinent, making it more likely that they crawl and index those pages rather than, say, tag pages or a now-irrelevant blog post from five years ago.
XML sitemaps aren’t mandatory, but they are valuable tools, particularly if you have a large website with many pages or — on the other end of the spectrum — a relatively new site that doesn’t have many external links yet.
You have the option of submitting your sitemap directly to search engines, but crawlers will be able to find it when they visit your site if you have a robots.txt file directing them to it.
What robots.txt does
A robots.txt file is a plain text file that you place in your website’s root directory to tell crawlers how you want your site to be crawled. These instructions can include which pages you want them to crawl, which ones they should avoid, or rules that block specific bots from crawling the site entirely. When crawlers visit a site, robots.txt is usually the first file they request. It’s also where you should list the location of your XML sitemap so the crawlers can easily find it.
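To see how a crawler might read these instructions, here is a small sketch using Python's standard-library robots.txt parser. The rules, the "BadBot" name, and the example.com URLs are invented for illustration, and `site_maps()` assumes Python 3.8 or newer.

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt: every path, bot name, and URL here is a placeholder.
robots_txt = """\
User-agent: *
Disallow: /tag/
Disallow: /private/

User-agent: BadBot
Disallow: /

Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Most crawlers fall under the "*" group: blog posts are fine, tag pages are not.
can_fetch_blog = parser.can_fetch("Googlebot", "https://example.com/blog/post")
can_fetch_tag = parser.can_fetch("Googlebot", "https://example.com/tag/seo")

# BadBot has its own group that blocks the whole site.
badbot_allowed = parser.can_fetch("BadBot", "https://example.com/")

# The Sitemap line tells crawlers where the XML sitemap lives.
sitemaps = parser.site_maps()
```

The same parser can be a handy way to check your own file before you upload it.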
How to create an XML sitemap
So, now that you’re up to speed on the value of XML sitemaps, how do you go about creating one? For the more technically inclined who want to make one manually, Google has instructions on doing just that. There are also several free generators online, such as this one.
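If you'd like to see what goes into the file, here is a rough sketch that builds a small sitemap with Python's standard library. The `build_sitemap` helper, the page list, and the dates are hypothetical; the structure (a `urlset` of `url` entries, each with a `loc` and an optional `lastmod`) follows the sitemaps.org format.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML as a string for a list of (url, lastmod) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        if lastmod:
            # lastmod is optional; skip it when no date is known.
            ET.SubElement(url, "lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Placeholder pages: list only the URLs you actually want crawled.
pages = [
    ("https://example.com/", "2024-01-15"),
    ("https://example.com/services/", None),
]
sitemap_xml = build_sitemap(pages)
```

Saving `sitemap_xml` to a file named sitemap.xml gives you something you can upload to your site.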
How to create and add a robots.txt file to your site
This can be a little tricky if you don’t have access to your website’s server. For WordPress sites, many sitemap plugins will do it for you. If you want to do it yourself, find out how to create and upload a robots.txt file to your server with this handy how-to. When you’re done with that, check out this guide from Google explaining how to add your XML sitemap link to the file.
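For the do-it-yourself route, the file itself is short enough to generate with a few lines of Python. This is only a sketch: `build_robots_txt` is a hypothetical helper, and the blocked paths and sitemap URL are placeholders you would swap for your own.

```python
def build_robots_txt(disallowed_paths, sitemap_url):
    """Return robots.txt content that applies to all crawlers."""
    lines = ["User-agent: *"]
    if disallowed_paths:
        lines += [f"Disallow: {path}" for path in disallowed_paths]
    else:
        # An empty Disallow value means nothing is blocked.
        lines.append("Disallow:")
    # The Sitemap line points crawlers to the XML sitemap.
    lines += ["", f"Sitemap: {sitemap_url}"]
    return "\n".join(lines) + "\n"


# Placeholder values for illustration only.
generated_robots_txt = build_robots_txt(
    ["/tag/", "/private/"],
    "https://example.com/sitemap.xml",
)
```

Write the result to a file named robots.txt and upload it to your site's root directory so crawlers can find it.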
Robots.txt and XML sitemaps may not be at the top of your SEO considerations, but they shouldn’t be overlooked. By taking the time to create a sitemap and add a robots.txt file to your site, you’ll have more of a say in how your website is crawled and ultimately indexed, which should have a positive impact on your overall SEO.