How to Avoid Schema Markup Drift
Google structured data updates occur quite often, and they are not under your control. Staying current may keep your products findable, or even help you avoid a manual penalty.
Creating high-quality content is valuable to both your site visitors and Google. Structured data makes it easy for search engines to learn what useful data you publish and your intent. Consider the crucial details on your product pages, such as pricing, product availability, product variants, how-to details, and contact information; all of this needs valid schema markup to shape your knowledge base.
What is Schema Markup Drift?
Schema drift can be a complex issue to manage. It occurs when web page content and schema markup diverge, and managing it encompasses the strategies and tasks that handle that divergence. Typically this happens when plugin-generated or static, hard-coded schema.org markup falls behind either code or content updates. If we consider the meaning of the word “drift” as a noun, it’s very apropos. Oxford Languages says it is “a continuous slow movement from one place to another.”
As with anything gradual, it is often easy to miss without an intentional focus.
Progress on the web is continual, and it presses SEOs to be agile in their approach to managing semantics and schema.
Ways that markup continually evolves:
- Google introduces new features and retires others.
- New versions roll out at Schema.org.
- Older content is updated or moved and the current code breaks.
- Someone decides to make a CMS or other platform change.
- JavaScript, apps, or other third-party entities are updated or switched.
- Syndicated or automated content changes what text and elements are on the page.
- The site’s ontology changes, is merged, or is migrated. (This can impact Breadcrumbs markup.)
- A team member makes a content or code change without being aware of the impact.
- Business entities change their name, move, merge, or make other identity changes.
Schema.org vocabulary is removed
Scheduled Schema.org vocabulary updates often add new vocabulary opportunities while removing others. We’ve especially seen this happen with product options and medical terms. If these changes apply to you, conduct an RDF database query to uncover them. Only remove terms when you are sure the update will have a positive effect. When outdated schema terms are flagged, weigh whether replacing an archived or superseded term is worthwhile.
Examples of Schema Markup to Watch for Drift
Dataset schema markup has several new possibilities pending
Dataset schema markup was last updated on October 25, 2022, and many awesome ideas are being considered. For example, a knowledge graph of the AI innovation ecosystem is in process, with ML models as one kind of entity. As new possibilities emerge, more questions are surfacing. For example, “Can OML describe ML over semantic datasets?” While OpenML metadata uses some OML ontology, currently neither ml-schema nor Expose does. (Expose is a tunnel application using PHP that allows you to share your local sites and applications with others on the internet.)
We already have dataset schema markup. If MLModel is widely adopted as a subclass of Dataset, then might a superclass Model require a markup update? There are so many scenarios like this. Several schema markup updates are close to being rolled out and others are discussion points that may shape future implementation.
Item Condition credentials changed for valid product offer
As of October 2022, valid markup for itemCondition changed in both Microdata and JSON-LD. The content value of “new” no longer validates. To use this property, now specify NewCondition, if needed, within Offer. This is typically nested under Product schema. Fixing all product markup errors is especially important for eCommerce sites, to help keep your revenue flowing.
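To illustrate the itemCondition change described above, here is a minimal Python sketch that walks parsed JSON-LD and flags any itemCondition that is not written as a schema.org OfferItemCondition URL. The function name and the sample product are hypothetical, and the strict "full URL only" rule is our assumption for the example; validators may be more lenient.

```python
import json

# Enumeration members of schema.org's OfferItemCondition type.
VALID_CONDITIONS = {
    "NewCondition",
    "UsedCondition",
    "RefurbishedCondition",
    "DamagedCondition",
}


def invalid_item_conditions(node, path="$"):
    """Walk parsed JSON-LD and return (path, value) pairs for any
    itemCondition that is not a schema.org enumeration URL.
    (Assumption: we require the full URL form.)"""
    problems = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}"
            if key == "itemCondition" and isinstance(value, str):
                # Accept both http and https forms of the schema.org URL.
                name = value.rsplit("/", 1)[-1]
                if "schema.org/" not in value or name not in VALID_CONDITIONS:
                    problems.append((child, value))
            else:
                problems.extend(invalid_item_conditions(value, child))
    elif isinstance(node, list):
        for i, item in enumerate(node):
            problems.extend(invalid_item_conditions(item, f"{path}[{i}]"))
    return problems


# Hypothetical product markup: the first offer still uses the old "new" value.
product = json.loads("""
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Example widget",
  "offers": [
    {"@type": "Offer", "price": "19.99", "priceCurrency": "USD",
     "itemCondition": "new"},
    {"@type": "Offer", "price": "24.99", "priceCurrency": "USD",
     "itemCondition": "https://schema.org/NewCondition"}
  ]
}
""")

for where, value in invalid_item_conditions(product):
    print(f"{where}: {value!r} is not a schema.org OfferItemCondition URL")
```

Run against the sample, only the first offer is reported. A check like this could sit in a scheduled crawl so that old values surface before they cost you a rich result.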
Clarified review schema markup
Another thing to know: Google has recently made it clearer that embedding local reviews within your web pages doesn’t impact web rankings. This means there may be other, more essential structured data implementations to spend your time on. John Mueller clearly stated, “make sure not to use structured data markup on reviews not collected on your site.” This is something that I’ve asked about in the past and received fuzzy answers. While we’ve been cautious, it’s always best to have real clarity. For some pages that we’ve not updated in years, the review content needs updating to avoid schema markup drift.
Structured data is often manually added to your website code – if you employ someone with proven know-how. Several SEO plugins can assist with minimal effort. In either case, keeping an observant watch on schema requirements can save a lot of headaches.
Know What’s Occurring if a Plugin Manages your Schema
Roger Montti, a favorite author of mine, recently wrote about how the All In One SEO WordPress Plugin Vulnerability Affects Up To 3+ Million. This plugin comes with built-in support for Schema markup, as many do.
It highlights the need to be cautious about using plugins to manage your schema markup. Many times, the “drift” is significant. Having tested multiple similar plugins, I’m surprised how often their own sites, using their own tools, show significant errors. I do sympathize with the work it must take to keep an app or plugin current. I also sympathize with the three million+ active users who discovered they are vulnerable to two cross-site scripting (XSS) attacks.
This is no trifling matter. Montti reports that “The vulnerabilities affect all versions of AIOSEO up to and including version 4.2.9.” How does this happen? Cross-site scripting (XSS) attacks are a form of injection exploit in which malicious scripts execute in a user’s browser, which can then lead to access to cookies, user sessions, and even a site takeover.
Well-maintained plugins and apps may automatically ingest current schema vocabulary versions. You will want to explore all schema opportunities that support all possible entity types. However, it takes experience to know which entity type is best for the purpose of individual pages. You want both effective and streamlined schema markup to avoid code bloat and have the best representation of your page.
Why Make Search Engines Have to Guess?
It can be a big guessing game for Google to determine the key points in your content, who’s the author, the company/organization it represents, and which data is most useful for matching search queries. Content updates are a good thing. You just shouldn’t forget that your schema needs to be updated at the same time.
Schema Markup Drift Comment by Martin Splitt
“You will have drift between your schema markup and what’s on the page with updates over time. Whoever has control over putting the content on the page has the semantic responsibility. Whether dev or SEO. Try to map whatever is on a page to broader concepts.” – Martin Splitt, Head of Google Developer Relations[1]
It is helpful to properly structure and nest your Schema so that search engines can quickly recognize the various properties of a given entity and the relational nodes between them and other entities. We prefer JSON-LD, which annotates elements on a page, structuring the page’s data, which can then be helpful to search engines.
A key advantage of implementing structured data markup is disambiguating page elements and establishing qualities and facts surrounding entities, which in turn contributes to a more organized, better web overall. However, the advantages of structured data markup can be quickly lost if past code becomes redundant.
Typically your core name, address, and phone number will change less often. For these foundational business details, maintaining accurate Organization structured data or Local Business schema is vital.
Why is Avoiding Schema Markup Drift Important?
Schema increases your chances of winning rich results: Google search engine result pages (SERPs) are continually adding new ways to nab highly visual placements. For example, the popular “People Also Ask” boxes and “People Also Search For” featured snippets can include links directly to your website.
Improves your website quality and E-E-A-T: On its own, structured data is not a direct ranking factor. However, Google consistently recommends its usage because it helps its search engine know what’s on your site. Educating Google about your content and the entities on your site through structured data also simplifies and improves its assessment of your website’s quality and E-E-A-T.
You may have some reason to simply want to change the schema of an existing article. Be aware that abuse of schema can trigger an algorithmic or a manual penalty. Schema drift and on-page changes may go unnoticed for some time; however, it is not worth losing your hard-won rankings and getting dropped in Google search.
Structured Data Feeds Your Google Knowledge Graph
Whether you manually manage your schema or automate structured data, it has many uses.
Google’s Knowledge Graph sphere is rapidly commanding more SERP space. While Google gathers information in many ways that we don’t understand, we know that schema code feeds its bots information.
In 2018 Google began talking about its Topic Layer. This aids its machine learning to better assess web content and how topics and subtopics work together. Google stated the following:
“So we’ve taken our existing Knowledge Graph—which understands connections between people, places, things and facts about them—and added a new layer, called the Topic Layer, engineered to deeply understand a topic space and how interests can develop over time as familiarity and expertise grow. The Topic Layer is built by analyzing all the content that exists on the web for a given topic and develops hundreds and thousands of subtopics. For these subtopics, we can identify the most relevant articles and videos—the ones that have shown themselves to be evergreen and continually useful, as well as fresh content on the topic. We then look at patterns to understand how these subtopics relate to each other, so we can more intelligently surface the type of content you might want to explore next.” – Helping you along your Search journeys
Both Google’s structured data adoption and the ever-evolving Schema.org library are fantastic advantages that can guide how your content can be better structured. It can help your site resolve content ambiguity and disorganized data.
Basically, schema drift makes your content confusing to search engines. It is a data quality issue. Think of it as your data layer that informs Google’s topic layer. Managing data quality becomes a bigger task as publications and pages accumulate over time. Schema is meant to serve a specific purpose. Structured content backed by data quality is super helpful, but when it drifts from accuracy, it is a data problem.
Your schema needs credibility. Establish an internal best practices workflow to manage schema changes. Monitoring data quality matters.
How do you Detect Schema Markup Drift?
Schema drift detection methods:
Google Search Console Reports
Since we are keen to win Google rich results for clients, keeping a close eye on Search Console reports is a priority task. This Google tool caches versions of web pages over time depending on your crawl rate. So, if you are frequently making changes to your markup, the report may not reflect the latest markup on your site.
Screaming Frog Custom Reports
Mapping schema markup during a site migration is a huge task. Often even slight content changes during migration impact schema. For example, if someone changes question-answer content, it may throw off your FAQ schema markup. Check every system to derive the desired database schema from all migration files, and then compare it with the actual schema in the live database to detect the drift.
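The FAQ case above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a feature of Screaming Frog or any other crawler: it assumes each JSON-LD block is a single top-level FAQPage object, that question text appears verbatim on the page, and the class and function names are hypothetical.

```python
import json
from html.parser import HTMLParser


class PageScan(HTMLParser):
    """Collect JSON-LD blocks and visible text from one HTML document."""

    def __init__(self):
        super().__init__()
        self.current = None  # "jsonld", "hidden", or None for visible text
        self.buffer = []
        self.jsonld_blocks = []
        self.text_chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self.current, self.buffer = "jsonld", []
        elif tag in ("script", "style"):
            self.current = "hidden"

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            if self.current == "jsonld":
                self.jsonld_blocks.append("".join(self.buffer))
            self.current = None

    def handle_data(self, data):
        if self.current == "jsonld":
            self.buffer.append(data)
        elif self.current is None:
            self.text_chunks.append(data)


def normalize(text):
    """Lowercase and collapse whitespace so small layout changes don't matter."""
    return " ".join(text.lower().split())


def drifted_faq_questions(html):
    """Return FAQ schema questions that no longer appear in the visible text."""
    scan = PageScan()
    scan.feed(html)
    scan.close()
    visible = normalize(" ".join(scan.text_chunks))
    missing = []
    for block in scan.jsonld_blocks:
        data = json.loads(block)
        # Assumption: each block is a single top-level FAQPage object.
        if not isinstance(data, dict) or data.get("@type") != "FAQPage":
            continue
        for question in data.get("mainEntity", []):
            name = question.get("name", "")
            if normalize(name) not in visible:
                missing.append(name)
    return missing


# Hypothetical page: the second question was removed from the content
# but is still present in the markup.
page = """
<html><body>
<h2>How long does shipping take?</h2>
<p>Orders ship within five business days.</p>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "FAQPage", "mainEntity": [
  {"@type": "Question", "name": "How long does shipping take?",
   "acceptedAnswer": {"@type": "Answer", "text": "Orders ship within five business days."}},
  {"@type": "Question", "name": "Do you ship overseas?",
   "acceptedAnswer": {"@type": "Answer", "text": "Yes."}}
]}
</script>
</body></html>
"""

print(drifted_faq_questions(page))
```

An exact-match comparison like this is deliberately strict; on a real site you would likely loosen it to tolerate punctuation or minor rewording.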
Microsoft Schema Drift in Mapping Data
To enable schema drift control, select “Allow” schema drift in your sink transformation. Once schema drift is enabled, ensure the Auto-mapping slider in the Mapping tab is turned on. With this slider on, all incoming columns are written to your proper destination. Otherwise, you must use rule-based mapping to write drifted columns.
How to Avoid Schema Markup Drift from Occurring?
Prevent schema drift with these steps:
- Designate one individual to responsibly monitor schema.
- Take a full team approach.
- Check for external schema drift.
- Invest in schema analyzer tools.
- Be selective, strategic, and streamlined in schema management.
Designate one individual to responsibly monitor schema.
Tasks like this may get missed or delayed unless both the schedule and the person accountable for oversight are specified. SEOs are often overloaded with opportunities and tasks, so this only works if they can set aside sufficient time each week to complete checks and communicate when drift occurs.
Take a full team approach.
Once you’ve identified ways that drift can occur, set up a notification system so that anyone who makes an update that changes your data synchronization communicates it to others. Changes to external webpage containers, plugin configurations, themes, and so on may cause issues. Multiple plugins that influence your markup may cause conflicts or redundancies.
Every team member who may tweak your schema should:
- Align on the same language for each schema property.
- Align on the same structure for adding values, including date and time formats.
- Share the same syntax for every property, such as capitalization, hyphens, and punctuation.
- Use the same encoding format per property (example: character encoding).
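One way to make a shared convention enforceable is a small lint check. The sketch below covers only the date format item from the list above: it assumes the team has agreed on ISO 8601 values and uses a hypothetical list of date properties; it is an illustration, not a complete style checker.

```python
from datetime import datetime

# Hypothetical house list of properties that should hold ISO 8601 values.
DATE_PROPERTIES = {"datePublished", "dateModified", "priceValidUntil"}


def non_iso_dates(node, path="$"):
    """Return (path, value) pairs for date properties that do not parse
    with datetime.fromisoformat.

    Caveat: before Python 3.11, fromisoformat rejects some valid ISO 8601
    forms (for example a trailing "Z"), so this check is stricter there.
    """
    problems = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}"
            if key in DATE_PROPERTIES and isinstance(value, str):
                try:
                    datetime.fromisoformat(value)
                except ValueError:
                    problems.append((child, value))
            else:
                problems.extend(non_iso_dates(value, child))
    elif isinstance(node, list):
        for i, item in enumerate(node):
            problems.extend(non_iso_dates(item, f"{path}[{i}]"))
    return problems


# Hypothetical article markup with one free-text date.
article = {
    "@context": "https://schema.org",
    "@type": "Article",
    "datePublished": "2023-01-05",
    "dateModified": "January 9, 2023",
}

print(non_iso_dates(article))
```

The same pattern extends to the other conventions, such as capitalization or punctuation rules per property, once the team has written them down.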
Check for external schema drift
Sometimes we’ve combined schemas, for example, from WordLift or Yoast with our custom formats. In some cases, your markup is then hosted on a third party. While these additive schema markup methods have benefits, observe that @id can drift and contribute to external schema drift.
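A simple way to spot @id drift after combining markup from several sources is to look for references that no longer point at any defined node. The following Python sketch assumes the combined markup has already been merged into one parsed object and treats an object holding only @id as a reference; the function name and sample URLs are hypothetical.

```python
def dangling_id_references(graph):
    """Return @id values that are referenced but never defined.

    A node counts as a definition when it carries keys besides @id;
    an object holding only @id is treated as a reference.
    """
    defined, referenced = set(), set()

    def walk(node):
        if isinstance(node, dict):
            node_id = node.get("@id")
            if node_id is not None:
                if len(node) > 1:
                    defined.add(node_id)
                else:
                    referenced.add(node_id)
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(graph)
    return sorted(referenced - defined)


# Hypothetical merged markup: the custom Article block still points at an
# older organization @id that the plugin no longer emits.
merged = {
    "@context": "https://schema.org",
    "@graph": [
        {"@type": "Organization", "@id": "https://example.com/#organization",
         "name": "Example Co"},
        {"@type": "WebSite", "@id": "https://example.com/#website",
         "publisher": {"@id": "https://example.com/#organization"}},
        {"@type": "Article", "@id": "https://example.com/post/#article",
         "publisher": {"@id": "https://example.com/#org"}},
    ],
}

print(dangling_id_references(merged))
```

A reference can legitimately point at a node defined on another page, so treat the output as a list to review rather than a list of errors.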
Invest in schema analyzer tools
Some apps and platforms provide a Schema Analyzer that can conduct scheduled crawls of sites to view your schema data’s overall health. Many provide visualized reports after querying the data (RDF triples) for deprecated schema properties (Suggestion: Screaming Frog Crawler). Make use of linked open data to stay relevant.
Be selective, strategic and streamlined in schema management
Strike a good balance between providing insufficient data to search engines and becoming a “noisy environment”. One might think that by using multiple schema markup plugin managers, you may accomplish what a single platform provides. That is possible. However, risks of creating excessively noisy code exist. We suggest making a strategic choice and developing open communication for premium support from that third party. This often requires paying for a premium version.
Two main sources to follow to keep informed of schema markup changes:
If this task looks a bit time-consuming, we won’t say you’re wrong. However, one thing SEOs agree on in general is that search marketing remains a time-intensive strategy. No single version of search engine optimization will quickly and immediately guarantee you first-page search result rankings. Only paid search options allow you to bid on first-page ad spots.
Regular audits that maintain valid schema markup on your web pages will catch when schema drift occurs.
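As one possible building block for such audits, the sketch below fingerprints each page's parsed JSON-LD and compares two audit snapshots. It only reports that markup changed, was added, or was removed; a person still has to confirm whether the page content changed with it. The function names and sample pages are hypothetical.

```python
import hashlib
import json


def fingerprint(jsonld):
    """Hash a parsed JSON-LD object in a key-order-independent way."""
    canonical = json.dumps(jsonld, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def changed_pages(previous, current):
    """Compare two {url: fingerprint} snapshots and return URLs whose
    markup was added, removed, or altered since the previous audit."""
    urls = set(previous) | set(current)
    return sorted(u for u in urls if previous.get(u) != current.get(u))


# Hypothetical snapshots from two audits.
last_audit = {
    "/pricing": fingerprint({"@type": "Product", "name": "Plan A",
                             "offers": {"price": "10"}}),
    "/about": fingerprint({"@type": "Organization", "name": "Example Co"}),
}
this_audit = {
    "/pricing": fingerprint({"@type": "Product", "name": "Plan A",
                             "offers": {"price": "12"}}),
    # Same data with keys in a different order: not reported as a change.
    "/about": fingerprint({"name": "Example Co", "@type": "Organization"}),
}

print(changed_pages(last_audit, this_audit))
```

Pairing a list like this with a record of recent content edits makes it easier to see which side moved without the other.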
SUMMARY: You don’t Want to be Derailed by Schema Markup Drift
If you’d like help keeping your schema accurate, call 651-205-2410. We’d love to help! We’re passionate about providing custom SEO schema markup solutions to solve your unique needs. For example, we’ll evaluate your website’s crawl budget, code duplication, and possible intent disambiguation issues. We’re also familiar with the most common mistakes that prevent Google from indexing schema markup. Your data quality matters and we can help.
Audits Protect from Structured Data Markup Drift
References:
[1] https://webinars.searchenginejournal.com/search-engine-journal/Q-A-With-Google-s-Martin-Splitt-Semantic-HTML-Search-Google-Search-Console