What is the Issue?
Based on my experience, canonical tags are one of the most often overlooked HTML elements.
Canonical tags are a relatively new HTML element, introduced by Google in 2009 (with support from Yahoo and Microsoft). Google introduced them to help webmasters tell it which page on a website is the right one to index. Given dynamic parameters in URLs, multiple URLs for the same page, multiple URLs with the same content, etc., there is a plethora of URLs for Google to find and potentially index. Google can capture all of the URLs, but it then has trouble selecting from this potentially massive set of links the right one to serve in its search results. So Google allowed the webmaster to choose for it.
The introduction of canonical tags cuts out the vast majority of duplicate content issues and consolidates links from the derivative URLs into the canonical URLs. Yea!
Why Is It Important?
Duplication issues are like a sprinter forgetting his trainers at the hotel. You’re at a huge disadvantage out of the gate. With duplication problems, search engines don’t know which page to rank. Duplication causes issues with indexing, relevancy, and authority, and ultimately it deeply discounts the page’s ranking potential.
How to find out if there is an issue with your canonical tags?
There are three scenarios that I see often:
Scenario #1: The canonical tag is on the page, but on nearly every page the canonical URL points to an irrelevant URL.
Scenario #2: The canonical tag is not in the head tag. Big mistake!
Scenario #3: The canonical tag is generated by JavaScript and Google has a harder time finding it.
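Scenario #1 is easy to miss when you look at one page at a time, so here is a minimal sketch of how I might spot it across many pages. It assumes you already have a mapping of page URL to canonical URL (for example, from a crawl export); the function name, the sample URLs, and the 50% threshold are all illustrative choices of mine, not a standard.

```python
from collections import Counter

def find_overused_canonicals(page_to_canonical, threshold=0.5):
    """Return canonical URLs that more than `threshold` of all pages point to."""
    total = len(page_to_canonical)
    if total == 0:
        return []
    counts = Counter(page_to_canonical.values())
    return [url for url, n in counts.items() if n / total > threshold]

# Hypothetical crawl results: page URL -> canonical URL found on that page.
crawl = {
    "https://www.example.com/": "https://www.example.com/",
    "https://www.example.com/about": "https://www.example.com/",
    "https://www.example.com/contact": "https://www.example.com/",
    "https://www.example.com/blog": "https://www.example.com/blog",
}

suspicious = find_overused_canonicals(crawl)
print(suspicious)  # ['https://www.example.com/']
```

A canonical URL that most of the site points to is not proof of a problem, but it is usually worth a closer look.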
BTW - When Google doesn’t find a “user-defined” canonical tag, it selects one for you. This process can take time, so you could see pages sitting in the queue for a while before they get indexed, but more importantly, the page Google selects may not be your desired page. If this happens, treat it as a red flag and investigate further.
How to check step by step if the right canonical tag is on the page?
There are a variety of approaches to ensure that the right canonical tag is properly in place, but I’ll attempt to show you the two basic options that get you “hands on keyboard”, looking through and inspecting the code.
Once you understand what you should expect for one page, it’ll be easier to scale this across multiple pages with the more robust tools you will eventually have at your disposal.
Option 1: Manual Inspection
Select a page on your website.
It can be any page, including the homepage. My preference would be the next most visited page after the homepage, since the homepage’s canonical tag URL is more likely to be correct, by accident, than that of any other page on the website.
Open the page source by right-clicking on the page. This can sometimes be tricky. You have to make sure that your mouse is hovering over a plain HTML/CSS area of the page, and not over a JavaScript-driven area, so that the “page source” menu item appears for you to click on. If it works correctly, your web browser will open a new tab displaying the page’s HTML markup.
Hit Command+F (Ctrl+F on Windows) to open the find function on the page and type in “canonical” to locate all of the instances where the code references a canonical. Is there one (and only one) canonical tag present?
If the answer is no because you find more than one canonical tag in the HTML, then you need to work with your developer to remove the extra canonical tag(s) from the pages.
There should be only one canonical tag on the page, specifically within the HTML head section.
If the answer is no because you couldn’t find any canonical tag, then open the inspect tool on the page you selected and check for it the same way. It may be set as a JavaScript-based tag only. More on the inspect tool later.
If the answer is yes and you can see it in the source code, then it should look something like <link rel="canonical" href="https://www.example.com/home-school.html" />. Next, ask yourself: is it in the HTML head tag?
If not, then work with your developer to move the canonical tag into the HTML head tag.
If it is in the HTML head tag, then is the canonical tag URL referencing the root URL of the page that it is on? By root URL, I mean the page URL without any parameters.
If yes, then consider whether that should be the case. It should be the root URL of the page itself 9 out of 10 times.
If the answer is no, find out why, because given its position in the site architecture, it should reference itself. There might be a good reason it doesn’t; perhaps there was a decision not to index it, in favor of another page.
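The manual checks above can be sketched in a few lines of Python, which may help once you want to repeat them on more than one page. This is only a sketch using the standard library’s HTML parser; the class name, function name, result strings, and sample markup are my own illustrative choices.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

class CanonicalFinder(HTMLParser):
    """Collect every rel="canonical" link and note whether it sits inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.found = []  # list of (href, inside_head) tuples

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "body":
            self.in_head = False  # a body tag means the head is over
        elif tag == "link":
            attr = dict(attrs)
            rel_values = (attr.get("rel") or "").lower().split()
            if "canonical" in rel_values:
                self.found.append((attr.get("href"), self.in_head))

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False

def inspect_canonical(html, page_url):
    """Walk through the same questions as the manual inspection."""
    finder = CanonicalFinder()
    finder.feed(html)
    if not finder.found:
        return "no canonical tag in source"
    if len(finder.found) > 1:
        return "more than one canonical tag"
    href, inside_head = finder.found[0]
    if not inside_head:
        return "canonical tag outside head"
    # Root URL = the page URL without parameters or fragment.
    root_url = urlsplit(page_url)._replace(query="", fragment="").geturl()
    if href == root_url:
        return "self-referencing canonical"
    return "canonical points elsewhere"

# Hypothetical page source for illustration.
sample_html = """<html><head><title>Home school</title>
<link rel="canonical" href="https://www.example.com/home-school.html" />
</head><body><p>Hello</p></body></html>"""

result = inspect_canonical(
    sample_html,
    "https://www.example.com/home-school.html?source=ppc&medium=search",
)
print(result)  # self-referencing canonical
```

Keep in mind that this parser does not repair broken markup the way a browser does, so treat its “inside head” answer as a first pass and confirm odd cases by eye.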
Option 2: Crawl the Site
Run a crawl of the site via Screaming Frog (or via your preferred SEO web scraping tool). It will automatically look for the canonical tag, but be careful. Some tools like Screaming Frog won’t tell you if it’s in the right location (the HTML head tag). To be on the safe side, open the view source or inspect element tool and check whether the canonical tag sits between the opening and closing HTML head tags.
i. If it’s not in the right location, let the developer know.
BTW - Regarding the inspect tool, you have to be careful. To ensure that the search engine will find the canonical tag or any other important SEO element, make sure that, when looking through the inspect tool, you don’t need to interact with the page, such as hovering the mouse over a section of the page, before the element appears in the code. If you must interact with the page, then it signals that the link or the copy depends on JavaScript to appear. Red flag! See screenshot for an example of when I didn’t need to interact with JavaScript to find the canonical tag.
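One way to tell whether a canonical tag depends on JavaScript is to compare the raw page source with the markup you see in the inspect tool. The sketch below assumes you have both as strings (for instance, pasted from view source and copied from the inspect tool); the function name, result labels, and sample markup are hypothetical, and the regular expression is a rough match for a canonical link element, not a full HTML parser.

```python
import re

# Rough pattern for a <link ... rel="canonical" ...> element.
CANONICAL_LINK = re.compile(
    r'''<link\b[^>]*\brel\s*=\s*["']?canonical\b''', re.IGNORECASE
)

def canonical_source(raw_html, rendered_html):
    """Say where the canonical tag shows up: raw source, rendered page only, or nowhere."""
    if CANONICAL_LINK.search(raw_html):
        return "in page source"
    if CANONICAL_LINK.search(rendered_html):
        return "added by JavaScript"
    return "missing"

# Hypothetical markup: the raw source has no canonical, the rendered page does.
raw = "<html><head><title>Product</title></head><body></body></html>"
rendered = (
    '<html><head><title>Product</title>'
    '<link rel="canonical" href="https://www.example.com/product123">'
    "</head><body></body></html>"
)

status = canonical_source(raw, rendered)
print(status)  # added by JavaScript
```

If the result is “added by JavaScript”, that matches Scenario #3 and is worth raising with the developer.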
Here’s how to fix the following issues:
Misplaced canonical tags: If you see the canonical tags outside of the head tags, ask the development team to add a canonical tag component to each page of the site, specifically within the head tag.*
Missing canonical tags: If you can’t find the canonical tags, notify the developer. Sometimes, the canonical tag field is available in the CMS, but if it isn’t filled in, you might not see the tag displayed in the HTML. If that’s the case, then the developer doesn’t have to implement a new HTML canonical tag element; he or she just needs to point the content manager to where the canonical tag can be filled in with the appropriate URL. For large scale e-commerce sites with thousands of URLs, however, the developer should assist with setting rules that will auto-generate the canonical tag URLs, regardless of whether the canonical tags are available or not.**
Wrong canonical tag URL: As you review, you should see only root URLs (like www.example.com/home-school.html) and not dynamic URLs (like www.example.com/home-school.html?source=ppc&medium=search) as the canonical tag URLs. If you see a dynamic URL, you should switch it out for the root URL (a URL without a question mark “?” and the series of parameters after the question mark).
Also, make sure that any canonical URLs that do not reference the page they are on are correct. For example, you open the source code on www.example.com/wedding/product123 and see the canonical tag URL as www.example.com/engagement/product123. We call these pages “canonicalized” pages, meaning the search engines should not index the page with the non-self-referencing canonical URL and should index only the page referenced in the canonical tag. If you believe the page has an incorrect canonical URL, then flag it and either update the canonical tag in the CMS or notify the developer to update the canonical tags. I would usually crawl the site with Screaming Frog to collect any pages with the wrong URL, then provide the correct canonical URL for each page and send the list back to the development team, stating which URL should be inserted into the canonical tag on which page.
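To make this review less tedious, here is a rough sketch of how the two checks could be run over a list of page URL and canonical URL pairs, such as the ones a crawl export gives you. The function name and the sample pairs are hypothetical; the script only sorts pages into buckets for a human to review and does not decide which canonical URL is actually correct.

```python
from urllib.parse import urlsplit

def audit_canonicals(pairs):
    """Split (page_url, canonical_url) pairs into two review lists.

    dynamic:        the canonical URL itself carries parameters.
    canonicalized:  the canonical URL is clean but points at a different page.
    """
    dynamic, canonicalized = [], []
    for page_url, canonical_url in pairs:
        # Compare against the page's root URL (no parameters, no fragment).
        page_root = urlsplit(page_url)._replace(query="", fragment="").geturl()
        if urlsplit(canonical_url).query:
            dynamic.append((page_url, canonical_url))
        elif canonical_url != page_root:
            canonicalized.append((page_url, canonical_url))
    return dynamic, canonicalized

# Hypothetical crawl export rows.
pairs = [
    ("https://www.example.com/home-school.html",
     "https://www.example.com/home-school.html?source=ppc&medium=search"),
    ("https://www.example.com/wedding/product123",
     "https://www.example.com/engagement/product123"),
    ("https://www.example.com/about",
     "https://www.example.com/about"),
]

dynamic, canonicalized = audit_canonicals(pairs)
print(len(dynamic), len(canonicalized))  # 1 1
```

The “dynamic” list holds URLs to switch out for root URLs, and the “canonicalized” list holds the pages whose target you should confirm before sending anything to the development team.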
** For e-commerce, you’ll run into duplicate pages, and you either have to find a rule that you can apply to all canonical tags or you need to update them manually.
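As an illustration of what such a rule might look like, here is a minimal sketch that builds a canonical URL by dropping the parameters from a page URL. The function name and the optional keep_params allow-list are assumptions of mine for the example (some sites may decide that certain parameters define a distinct page); your developer’s actual rule will depend on the site.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def build_canonical(url, keep_params=()):
    """Return the URL with its fragment removed and only allow-listed parameters kept.

    keep_params is a hypothetical allow-list; by default every parameter is dropped.
    """
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k in keep_params]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))

canonical = build_canonical(
    "https://www.example.com/home-school.html?source=ppc&medium=search"
)
print(canonical)  # https://www.example.com/home-school.html
```

A rule like this handles parameter-based duplicates only; products that live under more than one path still need a decision about which path is the canonical one.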
*The location is important because Google honors a canonical link element only when it appears in the head section. If the canonical tag ends up in the body, Google ignores it.