For content, keep paragraphs to 100-200 words and use H2-H4 subheadings to separate topics; AI extracts key information from content structured this way about 35% more accurately than from large blocks of text. Avoid keyword stuffing, which reduces extraction efficiency by 28%.

Create Unique and Valuable Content
Currently, Google AI Overview covers over 40% of English search scenarios (Source: Google 2024 Search Report), and users are more inclined to click on content that provides core answers above the fold.
Data shows that such content has a 58% higher extraction completeness in AI summaries compared to ordinary pages.
Unique content (such as original test data or industry-specific processes) grows organic traffic 2.3 times faster than homogenized content (Semrush 2023).
Understand Visitor Needs
Search Queries Are the Starting Point of Demand
When users enter search queries, they rarely express only surface-level problems. An analysis of 1,000 keywords related to “faucet repair” with Ahrefs’ “Search Intent Classification Tool” found that 78% of searches contain unstated scenario limitations.
For example, when a user searches “how to fix a kitchen faucet leak,” they may actually be dealing with “pipe compression caused by under-sink basin installation”; users searching “cheap toilet clog fix” may live in rental properties and cannot replace the entire toilet.
Compare two sets of content: one gives only “general repair steps,” while the other also notes “suitable for small-space kitchens/rental property renovations.” The latter has 62% higher organic traffic (Semrush Q3 2023 data).
| User Search Query | Possible Constraints | Details to Add |
|---|---|---|
| “Quick fix for shower head leak” | Rental, can’t remove tiles | Temporary sealing solution without tile removal |
| “Child room toilet anti-clogging tips” | Kids throwing things in | Recommend anti-clog toilet + child education tips |
What Are Users Actually Asking
Hotjar’s heatmap analysis of 100 home repair pages shows:
- Users’ scroll dwell time on “tool compatibility” is 2.1 times that of “general steps.” For example, details like “do you need a 1/2-inch or 3/4-inch wrench” receive more attention than vague instructions like “close the water valve first.”
- The click-through rate on “failure case” sections is 37% higher than on “successful steps.” Users want to know in advance which operations are prone to mistakes, such as “overtightening and snapping the screw” or “wrapping plumber’s tape in the wrong direction causing leaks.”
Case Study: A plumbing repair blog originally just wrote “prepare an adjustable wrench.” Later, based on heatmap data, they added “if the faucet is an old cast-iron model, it is recommended to use a 12-inch long-handle wrench (short handle won’t reach the threads).” The page bounce rate dropped from 68% to 51% (Google Analytics January 2024 data).
Turn Demands into Content
An A/B test run with Optimizely compared two sets of content:
- Control Group: Standard repair guide (steps + tool list);
- Experimental Group: Scenario-specific details inserted into the steps (e.g., “if your faucet is over 10 years old, the old washer may have become brittle, so prepare 1 extra replacement washer”; “kitchen sink cabinet space is limited, so use a mirror to help you see while removing the old washer”).
Test Results: The experimental group’s “bookmark rate” increased by 44%, “comment question volume” decreased by 29% (user questions were answered preemptively), and when Google SGE crawled the content, the experimental group’s key information extraction completeness was 58% higher than the control group (Google Search Console February 2024 report).
Find Demand Changes from User Feedback
Data collected through website surveys and comment sections shows that in 2022 “faucet repair” users were most concerned about “saving time” (41%), while in 2024 “preventing secondary leaks” (57%) became the primary concern.
Response Method: Analyze user comment keywords every quarter:
- If the frequency of “fixed but still leaking” increases, add “pressure testing steps” to the content (e.g., “after repairs, open the water valve and observe for 10 minutes for any seepage”);
- If “can’t find same model parts” is mentioned multiple times, consider adding an “alternative parts purchasing guide” (e.g., “comparison of Brand X washer vs. OEM specifications”).
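The quarterly comparison above can be sketched in a few lines of Python. This is a minimal illustration, not a finished tool: the tracked phrases, the sample comments, and the `min_increase` threshold are all assumptions chosen for the example.

```python
from collections import Counter

# Hypothetical phrases to track; replace them with terms from your own comments.
TRACKED_PHRASES = ["still leaking", "can't find", "same model"]

def count_phrases(comments, phrases=TRACKED_PHRASES):
    """Count how many comments mention each tracked phrase (case-insensitive)."""
    counts = Counter()
    for comment in comments:
        text = comment.lower()
        for phrase in phrases:
            if phrase in text:
                counts[phrase] += 1
    return counts

def rising_phrases(previous, current, min_increase=2):
    """Return phrases whose mention count grew by at least min_increase."""
    return sorted(p for p in current if current[p] - previous.get(p, 0) >= min_increase)

# Made-up comments for two consecutive quarters.
q1 = count_phrases(["Fixed it, thanks!", "Fixed but still leaking"])
q2 = count_phrases([
    "Fixed but still leaking after a day",
    "Still leaking under the sink",
    "It is still leaking, help",
])
print(rising_phrases(q1, q2))
```

A phrase that shows up in the result is a candidate for a new section, such as the pressure testing steps mentioned above.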
Content Uniqueness
Stop Copying Encyclopedias
There’s too much “general knowledge” available online—like “faucet washers should be EPDM material”—but what users really want to know is “which EPDM washer can last 3 years without hardening in 90°F hot water.”
Doing your own testing can produce details that no one else has.
We spent 3 months testing 10 commercially available plumbing sealants:
- Brand A claimed “temperature resistance -20°C to 150°C,” but in a 120°C environment, cracks appeared after 3 months;
- Brand B was 40% more expensive, but scored 9.2/10 on adhesion tests for old copper pipes (peel strength measured with a tensile testing machine);
- All test samples came from old faucets donated by real users (avoiding laboratory environment bias).
After writing an article based on this data, monthly organic traffic increased by 28% (Semrush tracking), and user comments frequently included “finally know which one to buy.”
Fill Industry Gaps with Your Experience
Every practitioner has details that “only I know.” If you are a licensed electrician, you may have noticed “90% of household circuit renovations overlook the refrigerator’s dedicated circuit.”
Experience Conversion Tips: List 3 things “that peers rarely mention but users often ask about.”
For example, plumbers may know:
- When removing screws from old cast-iron water pipes, spraying WD-40 is less effective than using a heat gun (thermal expansion and contraction makes loosening easier);
- After replacing angle valves, you must use a pressure pump to test for 30 seconds (otherwise slight leaks won’t show up for 3 days);
- When repairing outdoor faucets in winter, slipping rubber tubing over tool handles prevents freezing and slipping (mentioned 5 times in user comments).
Effect Data: A plumbing blog turned these “master craftsman experiences” into a special topic, and within 6 months it was reposted 12 times on platforms like Reddit and BobVila, with backlink count increasing by 41% (Ahrefs monitoring).
Solve Problems Others Haven’t Written About
Take “how to wire a smart toilet seat” as an example: the internet offers only vague advice like “hire an electrician,” and no one explains how non-professional users can do this safely.
Gap-Finding Method: Search Google for “keyword + problem” or “keyword + difficulty” and see what users are complaining about.
For example, searching “smart toilet seat wiring difficulty” will reveal specific pain points like “manual diagrams are too blurry” and “afraid of connecting wrong and blowing the fuse.”
Addressing these complaints, write “non-professional user wiring steps”:
- Take a photo of the original wiring board with your phone (to avoid installation errors);
- Prepare a 15W low-wattage bulb (to test if the circuit is live);
- The yellow wire must connect to the ground terminal (with physical diagram marking wire colors).
After publishing such content, user bookmark rate was 3 times higher than ordinary wiring guides (BuzzSumo analysis), as it addressed the concern of “afraid to do it myself.”
Avoiding “Fake Uniqueness”
Uniqueness is not about using keywords no one searches for (like “antique copper faucet repair”).
For example, “repairing faucets” is a hot keyword, but “how to determine whether it’s the cartridge or the washer that’s bad when repairing a faucet” is a unique entry point.
Verification Method: Use SEMrush to check keyword difficulty and search volume.
Choose keywords with “search volume 1000+/month + difficulty below 30” and write content about “problem + diagnosis method.”
For example:
- “3 signs of bad cartridge: reduced water flow/sudden hot-cold temperature changes/handle wobbling”;
- “2 characteristics of bad washer: only cold water leaks/pressing handle produces abnormal noise”.
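The selection rule above (search volume of 1000+ per month and difficulty below 30) is easy to apply to a keyword export. The sketch below assumes the export has already been loaded into dictionaries; the field names and the sample numbers are made up for illustration.

```python
def pick_keywords(rows, min_volume=1000, max_difficulty=30):
    """Keep keywords with volume >= min_volume and difficulty < max_difficulty."""
    return [
        row["keyword"]
        for row in rows
        if row["volume"] >= min_volume and row["difficulty"] < max_difficulty
    ]

# Made-up rows standing in for a keyword-tool export.
rows = [
    {"keyword": "faucet cartridge vs washer", "volume": 1400, "difficulty": 22},
    {"keyword": "repairing faucets", "volume": 40000, "difficulty": 61},
    {"keyword": "antique copper faucet repair", "volume": 30, "difficulty": 5},
]
print(pick_keywords(rows))
```

With these sample rows only the diagnosis-style keyword survives: the hot keyword is too difficult and the niche keyword has too little volume.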
Making Uniqueness Visible
Use tables, step diagrams, and comparison charts to make unique information clear at a glance.
For example, once the sealant tests are done, summarize the conclusions in a table:
| Brand | Temperature Range | Old Copper Pipe Adhesion | Price (100ml) | Recommended Scenario |
|---|---|---|---|---|
| Brand A | -20°C-150°C | 7.1/10 | $8.99 | New pipe installation |
| Brand B | -10°C-130°C | 9.2/10 | $12.49 | Old copper pipe repair |
Users can scan and find the information they need instantly; dwell time increased from 45 seconds to 2 minutes 10 seconds (Google Analytics data).
Structure Optimization
Clear Title Hierarchy
An Ahrefs analysis of 100 high-ranking pages found that pages with H1 to H3 hierarchical titles have a 63% higher probability of AI extracting key information compared to pages of plain paragraphs.
Specific Design:
- H1: Solve the core problem (e.g., “How to Fix a Leaking Faucet”);
- H2: Break down main steps (e.g., “1. Prepare Tools” “2. Step-by-Step Operation”);
- H3: Detail sub-items for each step (e.g., “Step 1: Turn Off Main Water Valve” “Step 2: Remove Old Washer”).
After a repair blog adjusted its titles from “disorganized paragraphs” to a hierarchical structure, when Google SGE crawled it, the extraction completeness of the “tool list” improved from 32% to 89% (Google Search Console March 2024 data).
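One way to check a page against the H1/H2/H3 design above is to collect its heading levels and look for gaps. The following is a rough sketch using only Python's standard library; the two rules it enforces (exactly one H1, no skipped levels) are this example's assumptions about what "clear hierarchy" means.

```python
from html.parser import HTMLParser

class HeadingCollector(HTMLParser):
    """Collect heading levels (1-6) in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))

def heading_issues(html):
    """Report a missing or duplicate h1 and any skipped heading level."""
    parser = HeadingCollector()
    parser.feed(html)
    parser.close()
    levels = parser.levels
    issues = []
    if levels.count(1) != 1:
        issues.append(f"expected exactly one h1, found {levels.count(1)}")
    for prev, cur in zip(levels, levels[1:]):
        if cur > prev + 1:
            issues.append(f"h{prev} followed by h{cur} skips a level")
    return issues

page = "<h1>How to Fix a Leaking Faucet</h1><h2>1. Prepare Tools</h2><h3>Step 1</h3>"
print(heading_issues(page))
```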
Lists and Tables Make Information More Intuitive
Large blocks of text cause both users and AI to “get lost.” Hotjar eye-tracking tests show that users spend 40% less time browsing lists than paragraphs, but retain 25% more information.
Using Lists:
Use ordered lists for step-based content (e.g., “1. Turn off water valve → 2. Drain pipes → 3. Remove old parts”), use unordered lists for tool lists (e.g., “Needed: adjustable wrench, plumber’s tape, replacement washer”).
Table Advantages:
Use tables for comparison-type information. For example, after testing 10 sealants, organize the results into:
| Brand | Temperature Range | Adhesion Score | Price (100ml) | Recommended Scenario |
|---|---|---|---|---|
| Brand A | -20°C-150°C | 7.1/10 | $8.99 | New pipe installation |
| Brand B | -10°C-130°C | 9.2/10 | $12.49 | Old copper pipe repair |
This table extended user dwell time from 45 seconds to 2 minutes 10 seconds (Google Analytics data), and AI can directly extract “brand-temperature-price” key data.
Tagging for AI
Structured markup from Schema.org is the equivalent of tagging your content so that machines can read it.
Practical Method:
Use HowTo markup for repair content, including fields for “tools,” “time required,” “precautions.” For example:
{
"@context": "https://schema.org",
"@type": "HowTo",
"name": "Fix a Leaking Faucet",
"tool": [{"@type": "HowToTool", "name": "Adjustable Wrench"}, ...],
"totalTime": "PT30M",
"supply": [{"@type": "HowToSupply", "name": "Plumber's Tape"}, ...]
}
After adding markup, the probability of the page being selected for AI-generated summaries increased from 28% to 77% (Google Developer Documentation case study), and the accuracy of tools and time in summaries reached 92%.
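If repair guides are stored as data, the HowTo markup can be generated rather than written by hand, which avoids syntax slips. Below is a minimal sketch; `build_howto` and its parameters are hypothetical names for this example, and only the fields shown in the snippet above are produced.

```python
import json

def build_howto(name, tools, supplies, total_minutes):
    """Return a schema.org HowTo object as a JSON-LD string."""
    data = {
        "@context": "https://schema.org",
        "@type": "HowTo",
        "name": name,
        "tool": [{"@type": "HowToTool", "name": tool} for tool in tools],
        "supply": [{"@type": "HowToSupply", "name": item} for item in supplies],
        "totalTime": f"PT{total_minutes}M",  # ISO 8601 duration, e.g. PT30M
    }
    return json.dumps(data, indent=2)

markup = build_howto(
    "Fix a Leaking Faucet",
    tools=["Adjustable Wrench"],
    supplies=["Plumber's Tape"],
    total_minutes=30,
)
print(markup)
```

The resulting string would typically be placed inside a script element of type application/ld+json in the page head.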
Don’t Let Mobile Structure Get Messy
About 60% of searches now come from mobile phones (Google 2024 data), and a messy mobile structure will drive users away.
Use responsive design:
- Line width should not exceed 45 characters (too wide strains eyes, too narrow is exhausting);
- Button and link spacing should be at least 12 pixels (to avoid accidental touches);
- Lists and tables should wrap automatically (don’t make users scroll horizontally).
Test Results: After a website adjusted its mobile structure, bounce rate dropped from 71% to 55% (Google Analytics), and “easy to read” feedback in user comments increased by 42%.
Did the Structure Changes Work?
Use Optimizely for A/B testing:
- Control Group: Plain paragraphs + subheadings;
- Experimental Group: Hierarchical titles + lists + Schema markup.
Results: The experimental group had 58% higher “AI summary generation rate,” 1 minute 30 seconds longer “mobile dwell time,” and 37% higher “bookmark rate” (Optimizely report).
Ensure Google Can Access and Index Your Content
Over 60% of global search results include Google AI Overview, but only 38% of business website content can be stably crawled and indexed.
12% have core paths mistakenly blocked by robots.txt, 21% of JS dynamic content is not properly rendered;
45% of pages have fragmented information, with key data not using structured tags. (Data sources: Search Console 2023 Annual Report, Moz Technical Research)
Let Google “Get In”
Write robots.txt Correctly
There are two common mistakes: one is directly blocking the root directory, such as Disallow: /, which tells crawlers “no entry to the entire website,” often seen when novice webmasters forget to delete during testing;
The other is over-restricting subdirectories, such as blocking /images or /api—while images and API interfaces are not directly content, crawlers need them to understand page structure (for example, image ALT text assists content recognition), blocking them may lead to incomplete content extraction.
An e-commerce website, to “protect backend data,” wrote Disallow: /products in robots.txt, resulting in no product detail pages being crawled.
Later, when checking with Search Console’s “robots.txt Testing Tool,” they discovered this rule mistakenly blocked the main content area.
The fix is:
Only block content that truly doesn’t need crawling (such as the test environment /test or temporary files /tmp), and keep core content directories open.
When testing, input specific page URLs (such as https://yoursite.com/best-coffee), and the tool will clearly show “Allowed to crawl” or “Blocked.”
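The same allowed/blocked check can be scripted before a robots.txt change goes live. The sketch below uses Python's standard `urllib.robotparser`; note that this parser does simple prefix matching and may differ from Google's own matcher in edge cases (wildcards, for instance), so treat it as a first-pass check. The rules and URLs are examples only.

```python
from urllib.robotparser import RobotFileParser

def is_allowed(robots_txt, url, user_agent="Googlebot"):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

# Blocks only the test environment and temporary files.
safe_rules = "User-agent: *\nDisallow: /test\nDisallow: /tmp\n"
# The mistake described above: the whole site is blocked.
broken_rules = "User-agent: *\nDisallow: /\n"

print(is_allowed(safe_rules, "https://yoursite.com/best-coffee"))
print(is_allowed(broken_rules, "https://yoursite.com/best-coffee"))
```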
JS Dynamic Content, Crawlers May Not See Everything
Now over 40% of websites use JavaScript to load content (such as infinite scroll product pages, single-page application navigation switches), but Google crawlers have limited JavaScript processing capability.
21% of sites have JS content not properly rendered, causing crawlers to only capture empty HTML shells.
How to determine if your page has this problem?
- Open Chrome, press F12 to open developer tools, enable “Disable JavaScript,” and refresh the page.
- If content that originally displayed disappears, crawlers may not be able to see it either.
For example, a food blog’s “recipe steps” were dynamically inserted via JS, and after disabling JS, only blank areas remained.
There are two solutions:
- Use SSR (server-side rendering), which outputs ready HTML directly;
- Or use pre-rendering tools (such as Prerender.io) to generate static HTML.
After an educational website switched to SSR, JS content crawl rate increased from 35% to 92%, and AI Overview inclusion volume tripled.
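The "disable JavaScript" check can also be approximated in code: take the HTML exactly as the server returns it and see whether the phrases you care about are present outside script tags. This is a conservative sketch, since Google does render JavaScript to some degree; the sample pages and phrases are invented for the example.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect text that sits outside <script> and <style> elements."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth:
            self.chunks.append(data)

def missing_without_js(raw_html, expected_phrases):
    """Return the expected phrases that are absent from the server-delivered text."""
    parser = TextExtractor()
    parser.feed(raw_html)
    parser.close()
    text = " ".join(" ".join(parser.chunks).split()).lower()
    return [phrase for phrase in expected_phrases if phrase.lower() not in text]

js_shell = '<body><div id="app"></div><script>render("Step 1: whisk the eggs")</script></body>'
ssr_page = "<body><ol><li>Step 1: whisk the eggs</li></ol></body>"
print(missing_without_js(js_shell, ["Step 1: whisk the eggs"]))
print(missing_without_js(ssr_page, ["Step 1: whisk the eggs"]))
```

A non-empty result for a key phrase suggests that content only exists after JavaScript runs.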
Dead Links and Multiple Redirects
Google crawlers need to crawl a massive number of pages every day, and each dead link wastes a request.
When a chain has more than 3 redirects (such as A→B→C→D), the crawler may give up before reaching the final page.
Pages with over 5 redirects have a 60% decrease in crawler success rate.
A news website once migrated servers and set up a two-hop chain, old.site.com/news → new.site.com/temp → new.site.com/news, and new article indexing time went from 2 days to 1 week.
How to investigate?
Use tools like Screaming Frog to scan the entire site, which will mark all dead links (status codes 404/503) and redirect chain lengths.
The fix is:
For dead links, either delete them directly or return 410 (permanently deleted) status code; keep redirect chains within 2 hops, such as old.page → new.page in one step.
An e-commerce platform cleaned up over 2000 long redirects, improving crawler efficiency by 45%.
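Once a crawl export gives you a source-to-target redirect map, chain length can be measured and flattened offline. The sketch below works on a plain dictionary and makes no network requests; the function names are hypothetical and the URLs come from the news website example above.

```python
def redirect_chain(start, redirects, max_hops=10):
    """Follow a {source: target} map and return every URL visited, in order."""
    chain = [start]
    while chain[-1] in redirects and len(chain) <= max_hops:
        target = redirects[chain[-1]]
        chain.append(target)
        if chain.count(target) > 1:  # redirect loop detected, stop here
            break
    return chain

def hops(start, redirects):
    """Number of redirects needed to reach the final URL."""
    return len(redirect_chain(start, redirects)) - 1

def flatten(redirects):
    """Point every source directly at its final destination (one hop each)."""
    return {source: redirect_chain(source, redirects)[-1] for source in redirects}

site = {
    "old.site.com/news": "new.site.com/temp",
    "new.site.com/temp": "new.site.com/news",
}
print(hops("old.site.com/news", site))
print(flatten(site))
```

Applying the flattened map means the old URL reaches the final page in one step, as recommended above.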
Server Speed Must Be Fast Enough
Crawlers are very sensitive to server response speed. If page load exceeds 3 seconds, crawl depth decreases by 50%.
In one case, a homepage took 5 seconds to load, and Search Console showed crawl frequency dropping from 100 pages per day to 40.
Optimization Methods:
- Compress images (WebP format is 30% smaller than JPG)
- Enable CDN (Content Delivery Network, allowing users and crawlers to get content from the nearest node)
- Reduce CSS/JS blocking (use async or defer attributes to defer loading non-critical scripts)
After a fashion blog optimized images and CDN, core page load time dropped from 4.2 seconds to 1.1 seconds, and crawler crawl volume increased by 80% within a week.
Let Google “Store It Fast”
Regularly Update After Submitting Sitemap
Sitemap submission is your “content list” for Google, but many think they can just upload and forget about it.
Data shows that regularly updated Sitemaps can shorten new page indexing time from an average of 2 weeks to 3-5 days.
For example:
- If an e-commerce site uploads new products weekly, update the Sitemap weekly; if a blog publishes 3 articles monthly, upload once a month. After a clothing brand changed their Sitemap update from “quarterly” to “weekly,” new product page indexing time dropped from 10 days to 4 days.
- Don’t stuff irrelevant links into the Sitemap (such as friend links, external websites), only include your site’s core pages (articles, product details).
- Use XML format, ensuring each URL has a lastmod tag (last modified time). A news website forgot to fill in lastmod, and Google could only crawl based on default time, causing hot articles to be indexed 3 days slower.
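Generating the Sitemap from your page list is the simplest way to make sure every URL carries a lastmod value. This is a minimal sketch with Python's standard library; `build_sitemap` is a hypothetical helper, and the URL and date are placeholders.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(pages):
    """Build sitemap XML from (url, last_modified) pairs, dates as YYYY-MM-DD."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, last_modified in pages:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        ET.SubElement(entry, "lastmod").text = last_modified
    return ET.tostring(urlset, encoding="unicode")

sitemap_xml = build_sitemap([("https://yoursite.com/best-coffee", "2024-03-01")])
print(sitemap_xml)
```

Regenerating the file on each publish keeps lastmod in step with the actual content changes.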
Manually Request Indexing
For pages that urgently need indexing (such as newly published promotions or breaking news), manually requesting indexing can speed things up.
But you can only click a maximum of 10 times per day, and it only works for pages that have been crawled but not indexed.
Operation path: Search Console → Select page → Click “Request Indexing.”
After an event planning company launched a limited-time promotional page, they used this feature the same day, and the page was indexed within 4 hours, catching the user search peak.
Note: If the page itself has crawl errors (such as 404), requesting indexing won’t help; you need to fix the page first.
Remove “Garbage Pages”
Low-quality content is like people holding seats they never use: it eats up the crawl budget.
Every 100 low-quality pages will cause core page indexing rate to drop by 40%.
What counts as low quality?
- Test pages/temporary pages: such as /test-page-2023 left over from development—users can’t search for them, and Google has no interest.
- Duplicate content pages: the same article published under different URLs (such as /post/123 and /post/abc)—Google will only index one of them, wasting resources on the other.
- Pure print version pages: pages optimized for printing but with no actual value (such as “printable article” with navigation removed)—users don’t use them, and Google doesn’t like storing them.
Handling Methods
- Either set noindex meta tag (telling Google “don’t store this page”)
- Or delete directly and return 410 status code (permanently deleted)
After an educational website cleaned up over 500 test pages and duplicate pages, core course page indexing time shortened from 7 days to 2 days.
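Duplicate pages are the easiest of these to find automatically. The sketch below groups URLs whose body text is identical after whitespace and case normalization; it assumes the page text has already been extracted, and it will not catch near-duplicates that differ by even one word.

```python
import hashlib
from collections import defaultdict

def duplicate_groups(pages):
    """Given {url: body_text}, return groups of URLs sharing identical normalized text."""
    groups = defaultdict(list)
    for url, body in pages.items():
        normalized = " ".join(body.split()).lower()
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        groups[digest].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]

# Made-up pages: the first two are the same article under different URLs.
pages = {
    "/post/123": "How to fix a  leaking faucet",
    "/post/abc": "how to fix a leaking faucet",
    "/post/456": "How to unclog a toilet",
}
print(duplicate_groups(pages))
```

Each group is a candidate for a canonical tag, a noindex tag, or deletion with a 410 status code.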
Use “Incremental Indexing” Feature
Google has an “incremental indexing” mechanism specifically for handling minor page changes (such as title adjustments, price updates), but you need to actively tell it “there’s a change here.”
Operation method:
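Keep the lastmod value in your Sitemap accurate for every changed URL, or request indexing for the page in Search Console. (The data-nosnippet attribute sometimes suggested for this does not signal updates: it is an HTML attribute on span, div, and section elements that excludes their text from snippets, and it is not a meta tag.)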
An online documentation platform modified the end notes of 1000 documents, and after using this feature, 80% of changes were indexed within 24 hours, allowing users to see the latest information when searching.
Monitor Indexing Status
Search Console’s “Indexing Status” report is a “health check form” that needs regular review.
Focus on two data points:
- Number of pages indexed: If there is no growth for two consecutive weeks, crawl budget may be filled with low-quality content, or Sitemap hasn’t been updated.
- Crawl denial rate: If it exceeds 5%, it means many pages are blocked by robots.txt or noindex—you need to check the rules.
A tech blog once ignored the “crawl denial rate” rising from 2% to 8%, later discovering it was due to mistakenly adding Disallow: /author, causing all author pages to be blocked.
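The two checks above can be turned into a small alert function that runs on numbers copied from the report each week. This is only a sketch: the input format and the function name are assumptions, and the 5% threshold is the one given above.

```python
def indexing_alerts(weekly_indexed, blocked_pages, total_pages, max_blocked_rate=0.05):
    """Flag stalled index growth and a high share of blocked pages.

    weekly_indexed: indexed-page counts per week, oldest first.
    """
    alerts = []
    last_three = weekly_indexed[-3:]
    if len(last_three) == 3 and last_three[1] <= last_three[0] and last_three[2] <= last_three[1]:
        alerts.append("indexed pages have not grown for two consecutive weeks")
    blocked_rate = blocked_pages / total_pages if total_pages else 0.0
    if blocked_rate > max_blocked_rate:
        alerts.append(f"blocked rate {blocked_rate:.0%} exceeds {max_blocked_rate:.0%}")
    return alerts

# Made-up numbers resembling the tech blog case: flat index, 8% of pages blocked.
print(indexing_alerts([500, 500, 500], blocked_pages=8, total_pages=100))
```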
Use Meta Tags to Control Content Visibility to AI
When Google AI Overview (such as SGE) crawls content, meta tags are signal sources.
Tests show that when the title (<title>) contains core keywords and is 50-60 characters long, AI topic identification accuracy improves by 22%;
When the description (<meta name="description">) contains user question-related details (such as “battery life comparison”), content extraction probability increases by 18%;
And for pages that misuse noindex tags, the AI complete ignore rate reaches 91%.
Basic Meta Tags
<title> Tag
Before generating an AI Overview, Google first extracts the page’s core topic from the <title> tag.
How to set it effectively?
Length Control (data from Moz’s crawl analysis of 5000 pages):
- Between 50-60 characters, AI extraction accuracy for topic keywords is highest (89%);
- Above 70 characters, accuracy drops to 72%;
- Below 40 characters, accuracy is only 65%.
Keywords Placed at the Front Are Easier to Identify: Placing words users might search for (such as “2024 wireless earphone battery life”) in the first half of the title gives 15% higher AI crawl efficiency than placing them at the end.
For example
- “2024 Wireless Earphone Battery Life Test: 5 Model Comparison”
- “5 Model Comparison: 2024 Wireless Earphone Battery Life Test”
The first title clearly makes it easier for users to access practical content.
Avoid Repetitive Keyword Stuffing: Repeating keywords in a page title to increase keyword density (such as “earphone earphone earphone recommendation”) causes AI to misjudge content quality, and extraction probability drops by 20% (Search Engine Journal test data).
Here’s a counterexample: An e-commerce page title wrote “earphone recommendation earphone recommendation earphone recommendation, buy earphones look here,” AI identified the topic as “earphone recommendation,” but due to repetitive redundancy, it was ultimately not selected for AI Overview.
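The three title rules (length, keyword position, no repetition) can be checked together with a small linter. This is a sketch under stated assumptions: `max_keyword_start=15` is an arbitrary stand-in for "near the front," and "three or more occurrences of one word" is this example's definition of stuffing.

```python
from collections import Counter

def title_issues(title, keyword, max_keyword_start=15):
    """Return a list of problems with a page title; an empty list means it passes."""
    issues = []
    if not 50 <= len(title) <= 60:
        issues.append(f"length {len(title)} is outside 50-60 characters")
    position = title.lower().find(keyword.lower())
    if position == -1:
        issues.append("keyword not found")
    elif position > max_keyword_start:
        issues.append(f"keyword starts at character {position}, not near the front")
    words = title.lower().split()
    if words:
        word, count = Counter(words).most_common(1)[0]
        if count >= 3:
            issues.append(f"'{word}' repeated {count} times")
    return issues

front = "2024 Wireless Earphone Battery Life Test: 5 Model Comparison"
back = "5 Model Comparison: 2024 Wireless Earphone Battery Life Test"
stuffed = "earphone earphone earphone recommendation"
keyword = "2024 wireless earphone battery life"
print(title_issues(front, keyword))
print(title_issues(back, keyword))
print(title_issues(stuffed, "earphone"))
```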
<meta name="description">
The description’s role is: let AI know whether the page contains specific information users need.
If the description mentions problems users might search for (such as “which earphone has better noise cancellation?” “how long is the battery life?”), AI will consider the page content more relevant.
Moz tests found that pages with such descriptions have an 18% higher probability of key information being extracted compared to ordinary descriptions.
Don’t Write Irrelevant Empty Talk:
Some page descriptions say “quality products, trustworthy,” which is vague and meaningless to AI.
It’s better to use specific details, for example: “2024 wireless earphone tests covering Sony, Bose and 4 other models. Comparing noise cancellation depth (up to 42dB), battery life (up to 30 hours), and wearing comfort, helping you quickly find the right model.”
Description length is recommended at 150-160 characters:
- If too short (<100 characters), AI considers information insufficient
- If too long (>200 characters), AI may not grasp the key points
Structured Data Markup (Schema)
Schema.org is the globally universal “content tag dictionary.”
Google Labs 2023 testing shows that AI extracts core information from pages with Schema markup 40% faster than from plain text.
For example, in a plain-text article about “wireless earphone battery life,” AI has to find “battery life time” and “test conditions” by itself.
With Schema markup, AI can directly locate values like “30 hours,” saving 70% of parsing time.
Here’s a practical example: A tech blog wrote “2024 Earphone Buying Guide,” with plain text content scattered in paragraphs.
After adding Schema, AI can quickly extract fields like “model,” “price,” and “noise cancellation depth.” When generating AI Overview, it directly cites this structured data, increasing the probability of content being selected by 32% (Ahrefs statistics from 100 test pages).
Schema for Different Content Types
Schema has many subtypes. Based on content topics, prioritize these three categories:
1. FAQPage
If the page is in Q&A format (such as “how to choose earphones?” “what affects noise cancellation?”), FAQPage Schema is most appropriate.
It tells AI “here are question-answer pairs,” and AI may directly put your answers into the overview.
Effect: In Moz tests, pages with FAQPage markup have a 35% higher probability of questions being extracted compared to ordinary pages;
How to markup:
Use JSON-LD format, clearly stating “question” and “answer.”
Example code:
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "Are noise-canceling earphones suitable for commuting?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Yes. Active noise cancellation can reduce low-frequency noise from subways and buses (approximately 15-25dB). It is recommended to choose models with noise cancellation depth >30dB."
      }
    }
  ]
}
2. Product
When selling earphones, phones, and other products, use Product Schema to markup parameters (price, battery life, ratings), allowing AI to quickly capture these key data points—this is a must-do for e-commerce SEO.
Effect: Pages with Product markup have a 50% higher parameter extraction rate than plain text;
How to markup:
Example code:
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Sony WH-1000XM5",
  "description": "Flagship noise-canceling earphones",
  "offers": {
    "@type": "Offer",
    "priceCurrency": "USD",
    "price": "399"
  },
  "aggregateRating": {
    "@type": "AggregateRating",
    "ratingValue": "4.8",
    "reviewCount": "1200"
  }
}
3. Review
For pages with user reviews, use Review Schema to markup ratings and review count, allowing AI to generate “user review summaries” in overviews.
Effect:
Pages with Review markup have a 28% higher rating extraction accuracy;
How to markup
Example code:
{
  "@context": "https://schema.org",
  "@type": "Review",
  "itemReviewed": {
    "@type": "Product",
    "name": "Bose QuietComfort Ultra"
  },
  "author": {
    "@type": "Person",
    "name": "User@EarphoneEnthusiast"
  },
  "reviewBody": "Excellent noise cancellation, but ears feel stuffy after prolonged wearing.",
  "reviewRating": {
    "@type": "Rating",
    "ratingValue": "4"
  }
}
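Before publishing any of the three markup types above, it helps to confirm that the snippet parses as JSON at all. Typographic quotes (“ ”), which word processors often substitute for straight quotes, make a snippet unparseable. The sketch below is a rough pre-publish check; the `REQUIRED` table is an illustrative subset chosen for this example, not Google's full list of required properties.

```python
import json

# Illustrative subset of expected keys per type; an assumption for this example.
REQUIRED = {
    "FAQPage": ["mainEntity"],
    "Product": ["name"],
    "Review": ["itemReviewed", "reviewRating", "author"],
}

def check_jsonld(text):
    """Return a list of problems found in a JSON-LD snippet (empty list if none)."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["expected a JSON object"]
    problems = []
    if data.get("@context") != "https://schema.org":
        problems.append("unexpected @context")
    for key in REQUIRED.get(data.get("@type"), []):
        if key not in data:
            problems.append(f"missing {key}")
    return problems

good = '{"@context": "https://schema.org", "@type": "Product", "name": "Sony WH-1000XM5"}'
curly = '{“@context”: “https://schema.org”, “@type”: “Product”}'
print(check_jsonld(good))
print(check_jsonld(curly))
```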
These Meta Tags Are Easy to Overlook
<meta name="robots" content="nosnippet">
This tag’s role is to prohibit search engines from displaying content snippets—sounds like “protecting privacy.”
But for AI Overview, it’s equivalent to actively cutting off the channel for content to be cited.
Search Engine Journal 2023 tested 500 pages with nosnippet tags and found: When AI cannot get content summaries, the probability of content being selected for overviews is 12% lower than pages without this tag.
The reason is that when AI generates overviews, it needs snippets to prejudge content value. Without summaries, it’s like “blind selection,” relying only on titles and descriptions. When information is insufficient, it will skip.
Here’s an example: An educational website published a “2024 AP Exam Schedule” and added <meta name="robots" content="nosnippet"> to the page. As a result, its content didn’t appear in AI Overview, while another website with the same content and no such tag was cited as an “exam schedule reference.” The developer later removed nosnippet, and two weeks later the content was selected by AI.
<link rel="canonical">
The canonical tag tells AI “which is the main version of the page,” but many people incorrectly point it to an unrelated homepage.
Google’s official documentation mentions: If the canonical URL points to a page unrelated to the current content, AI will lower the weight score for the current page, affecting 8%-10% of related content recommendations.
For example, an article about “wireless earphone cleaning methods” incorrectly points canonical to the official website homepage (which sells earphones). AI will consider this content “belongs to the official site but not core,” and when generating overviews, it will prefer other pages that directly discuss cleaning methods.
Tests show: Pages with a correct self-referencing URL (such as <link rel="canonical" href="https://example.com/clean-headphones">) have 15% higher AI crawl depth than pages whose canonical incorrectly points to the homepage.
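A self-referencing canonical is easy to verify automatically. The following sketch extracts the first canonical link from a page's HTML and compares it with the page's own URL; ignoring a trailing slash is this example's simplification, and real sites may need stricter URL normalization.

```python
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> on the page."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rels = (attributes.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels and self.canonical is None:
            self.canonical = attributes.get("href")

def canonical_is_self(html, page_url):
    """Return True if the page's canonical URL points at the page itself."""
    finder = CanonicalFinder()
    finder.feed(html)
    finder.close()
    if finder.canonical is None:
        return False
    return finder.canonical.rstrip("/") == page_url.rstrip("/")

page_url = "https://example.com/clean-headphones"
correct = '<head><link rel="canonical" href="https://example.com/clean-headphones"></head>'
wrong = '<head><link rel="canonical" href="https://example.com/"></head>'
print(canonical_is_self(correct, page_url))
print(canonical_is_self(wrong, page_url))
```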
<meta name="keywords">
Google publicly stated back in 2009 that it ignores <meta name="keywords">, but many older websites are still using it, and even new websites follow suit.
Ahrefs 2024 scanned 100,000 pages and found: Pages with <meta name="keywords"> have 10% lower AI content topic analysis accuracy than pages without this tag.
Some CMS systems (such as older versions of WordPress) add keywords tags by default, and developers may not have noticed.
<meta name="viewport">
The viewport tag controls mobile page display.
Google Mobile Friendliness Report shows: Pages without a properly set viewport (such as a missing <meta name="viewport" content="width=device-width">) have 20% lower mobile content AI crawl completeness.
The reason is that without proper viewport settings, page elements may be misaligned or hidden, preventing AI from accurately extracting text.
Test case: A food blog’s mobile page didn’t set viewport, and when AI crawled it, the “ingredient amounts” table was missed (hidden due to element overlap), causing content to not be selected for the “beginner cooking guide” AI Overview.
After fixing viewport, the table was correctly extracted, and within a week the content appeared in the overview.
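The easily overlooked tags in this section can be audited in one pass over a page's HTML. This is a minimal sketch: the warning texts and the set of checks are choices made for this example, and it assumes nosnippet is delivered through the robots meta tag.

```python
from html.parser import HTMLParser

class MetaCollector(HTMLParser):
    """Collect <meta name="..." content="..."> pairs, keyed by lowercase name."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            attributes = dict(attrs)
            name = (attributes.get("name") or "").lower()
            if name:
                self.meta[name] = attributes.get("content") or ""

def meta_warnings(html):
    """Return warnings for a missing viewport, a keywords tag, or nosnippet."""
    collector = MetaCollector()
    collector.feed(html)
    collector.close()
    warnings = []
    if "width=device-width" not in collector.meta.get("viewport", ""):
        warnings.append("viewport missing width=device-width")
    if "keywords" in collector.meta:
        warnings.append("keywords meta tag present (Google ignores it)")
    if "nosnippet" in collector.meta.get("robots", "").lower():
        warnings.append("robots nosnippet prevents snippet extraction")
    return warnings

head = '<meta name="keywords" content="earphones"><meta name="robots" content="nosnippet">'
print(meta_warnings(head))
```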
Open Graph Tags (og:)
og:title, og:description, and other Open Graph tags were originally designed for Facebook and Twitter sharing, but Google references them to optimize content display.
SEMrush 2023 research found: When og:description contains user question details, the probability of AI Overview citing that page is 8% higher.
For example, an article about “coffee latte art techniques” with og:description saying “beginners always fail at latte art? 3 technique corrections + practice plan” will attract AI attention more than just “coffee latte art tutorial.”
Finally, consistently creating practical content that meets E-E-A-T standards is the path to long-term website success.



