
How to use Screaming Frog for SEO | 2025 User Guide

Author: Don jiang

Anyone who does Google SEO knows that tools are the lever of efficiency. Take Screaming Frog as an example: this crawler can complete 8 hours of manual work in 20 minutes. It crawls every URL on your website and accurately identifies 80-120 common SEO issues (such as 404 dead links, duplicate titles, and images missing Alt attributes).

This article takes you from installation and setup to acting on the data, turning Screaming Frog into your “SEO microscope.”

How to Use Screaming Frog for SEO

Installation and Basic Settings

Installing Screaming Frog sounds like a simple matter of “clicking next a few times,” but the details matter. Some users reported that, because they ignored system compatibility during installation, the Mac version ran sluggishly, with crawling speed 40% slower than normal.

Others set the crawl depth arbitrarily, so a small website was crawled for 2 hours without capturing all core pages.

Pre-Installation Preparation

1. System Compatibility

Screaming Frog supports Windows 10/11 (64-bit) and macOS 10.15 and above. If your computer runs Windows 7 or macOS 10.14, the installer will report “incompatible,” and forcing it to run may cause crashes (in our tests, Win7 users saw a crash rate of approximately 35%).

2. Permission Issues

  • Windows: Install using an administrator account (right-click the installer → “Run as administrator”); otherwise, insufficient permissions may prevent crawl data from being written (common error: “Unable to save log file”).
  • Mac: You do not need to disable “System Integrity Protection” (SIP). On first run, however, you may need to click “Open Anyway” in “System Preferences → Security & Privacy,” otherwise the app will be blocked (approximately 20% of Mac users get stuck at this step).

3. Network Environment

Turn off proxy software (such as a VPN or accelerator) before crawling. Local network latency above 200ms can cut crawl speed by more than half (in our tests: 10 URLs per second at 200ms latency versus 25 URLs per second at 50ms latency).

Official Installation

Windows System

  1. Visit the Screaming Frog official website (www.screamingfrog.co.uk), click “Download Free Version” (the free version is sufficient for small to medium websites);
  2. Select “Windows Installer,” double-click to run after download is complete;
  3. Follow prompts to select installation path (C drive default is recommended to avoid custom paths causing configuration file loss), check “Create desktop shortcut,” click “Install”;
  4. After installation is complete, a green spider icon will appear on the desktop, double-click to open.

macOS System

  • Download from the official website, select “macOS DMG”;
  • Double-click the downloaded .dmg file, drag the “Screaming Frog SEO Spider” icon into the “Applications” folder;
  • On first opening, the system may report that the app cannot be opened because it comes from an unidentified developer. Go to “System Preferences → Security & Privacy” and click “Open Anyway” to proceed.

4 Basic Settings

After installation is complete, the first time you open the software, you need to configure “Spider” parameters.

If the settings are wrong, all subsequent crawl data may be useless.

User Agent

  • Function: Tells the website server “who I am.” Google’s crawler user agent is “Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)”.
  • Setting Method: Click the top menu「Configuration → Spider」and select “Googlebot” from the “User Agent” dropdown (the default is “Screaming Frog”).
  • Why It Matters: With the default “Screaming Frog” user agent, some websites block the crawler (for example, with a robots.txt Disallow rule aimed at that user agent), so content cannot be fetched. Using “Googlebot” simulates the real Google crawler and yields crawl data closer to reality (in our tests, an e-commerce site’s crawl success rate rose from 65% to 92% after switching).
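To see how a user-agent-specific block works in practice, the sketch below parses a hypothetical robots.txt with Python's standard `urllib.robotparser` and checks two crawlers against it. The robots.txt content, the `is_allowed` helper, and the example URL are assumptions made for illustration; a real site's rules will differ.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks one crawler by its user-agent token
# and allows everyone else.
ROBOTS_TXT = """\
User-agent: Screaming Frog SEO Spider
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://www.example.com/products/"
    for agent in ("Screaming Frog SEO Spider", "Googlebot"):
        print(agent, "->", is_allowed(ROBOTS_TXT, agent, url))
```

Running a check like this against a site's real robots.txt before a crawl can indicate whether a low crawl count comes from a user-agent block rather than from the crawl settings.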

Crawl Depth

  • Definition: Starting from the homepage, the maximum number of link layers to follow (for example, homepage → category page → product page is 3 layers).
  • Setting Recommendations:
    • Small to medium websites (≤1000 pages): Set to 5 layers (covers over 90% of core pages);
    • Large websites (>1000 pages): Set to 10 layers, combined with the URL crawl limit described in the next section to avoid excessive crawl time (10 layers may extend crawl time from 10 minutes to 1 hour).

Limit URL Crawl Count (Max URLs to Crawl)

  • Function: Prevents the software from crawling endlessly when a website has too many links (such as forums and infinite scroll pages).
  • Setting Method: In「Configuration → Spider」, check “Limit number of URLs to crawl” and enter a specific value (5000-10000 for small/medium sites, no more than 50,000 for large sites).
  • Consequences of Not Setting: A user once crawled an e-commerce site with “recommended products” dynamic links without setting a limit. The software crawled for 24 hours and ultimately captured 230,000 URLs (80% of which were duplicate product detail pages).

Exclude Parameters

  • Problem: Many website URLs carry unnecessary parameters (such as ?utm_source=weibo, ?page=2). These parameters don’t affect content, but Screaming Frog treats each variant as a different URL, causing duplicate crawling (for example, “product page” and “product page?page=2” are counted as 2 URLs).
  • Setting Method: Click「Configuration → Exclude」and enter the parameters to filter in “Query Parameters” (separated by commas), such as “utm_source,utm_medium,page”.
  • Effect: After an education website filtered 12 tracking parameters, the number of crawled URLs decreased from 12,000 to 4,500, and crawl time was shortened by 40%.
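The effect of excluding parameters can be reproduced outside the tool. The following is a minimal sketch, using only Python's standard library, of how dropping ignored query parameters collapses URL variants into one address. The parameter list mirrors the example above; `strip_params` and `dedupe` are hypothetical helper names, not part of Screaming Frog.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Parameters treated as noise (taken from the example in the text).
IGNORED_PARAMS = {"utm_source", "utm_medium", "page"}


def strip_params(url: str, ignored=IGNORED_PARAMS) -> str:
    """Return url with the ignored query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in ignored
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


def dedupe(urls):
    """Return the unique normalized URLs, preserving first-seen order."""
    seen = {}
    for url in urls:
        seen.setdefault(strip_params(url), url)
    return list(seen)


if __name__ == "__main__":
    crawled = [
        "https://example.com/product",
        "https://example.com/product?page=2",
        "https://example.com/product?utm_source=weibo&utm_medium=social",
        "https://example.com/product?color=red",
    ]
    print(dedupe(crawled))
```

In this sample, four crawled addresses reduce to two, because only the `color` parameter is kept as meaningful.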

Run a “Small Crawl” on the Homepage First

After settings are complete, don’t rush to crawl the entire site. First enter the homepage URL and click “Start” to run a small-scale test (limit the crawl count to 100), then check 3 things:

  1. Are key pages missing from the crawl: For example, are “About Us” and “Contact Us” in the homepage navigation being captured (search for keywords in the「Internal」report);
  2. Are there duplicate URLs: In the「URL」report, check whether different parameter versions of the same page appear (such as “/product” and “/product?color=red”);
  3. Are 404s triggered: Check 404 status codes in「Response Codes」and confirm that no deleted pages are being crawled (such as old event pages).

If problems are found, return to「Configuration」to adjust parameters (such as increasing crawl depth, adding exclude parameters), then test again.

Quick Start a Basic Crawl

Many people think “clicking start” is all there is to crawling, but in reality 30% of people end up with invalid data because they ignore details.

For example: one person didn’t check the network before starting and got stuck halfway due to high latency; another didn’t set limits, and after 2 hours the software was still crawling the same pages repeatedly; a third entered the URL in the wrong format and got “0 results”.

3 Pre-Start Checks

1. Confirm Basic Settings Are Complete

  • User Agent: Must be set to “Googlebot” (check in「Configuration → Spider」), otherwise the crawl may be blocked by websites (in our tests, a corporate website’s crawl success rate was only 45% without this setting and rose to 90% with it).
  • Crawl Depth: Adjust according to website size (5 layers for small/medium sites, 10 layers for large sites) to avoid crawling too shallow and missing key pages, or crawling too deep and wasting time.
  • Exclude Parameters: Filter unnecessary tracking parameters (such as ?utm_source) to reduce duplicate URLs (without filtering, one e-commerce site’s URL count was 3 times the actual count).

2. Test Network Stability

  • Latency Requirements: Latency from your machine to the target website should preferably be ≤100ms (test with the command「ping target domain name」).
    • Latency ≤100ms: Can crawl 20-30 URLs per second;
    • Latency 100-200ms: Crawls 10-15 URLs per second;
    • Latency >200ms: Crawls <10 URLs per second, and crawl time more than doubles (for example, a crawl that takes 10 minutes at low latency may take 25 minutes at high latency).
  • Avoid Interference: Turn off VPNs, accelerators, and download tools (in our tests, crawl speed decreased by 60% with a Thunder download running).

3. Confirm Target Website Is Accessible

  • Enter the target URL directly in the browser (such as https://example.com) and check that it opens normally (avoid crawling pages that return “403 forbidden access”).
  • If the website has login restrictions (such as a membership system), log out in advance (Screaming Frog cannot handle login status; it will crawl blank pages or 403 errors).

4 Steps to Operate, Get Results in 10 Minutes

1. Enter Target URL

  • Format Requirements: You must enter a complete URL (including http:// or https://), otherwise the software reports “invalid URL”.
    • Example: Correct input is「https://www.example.com」; wrong input is「www.example.com」or「example.com」.
  • Multiple Domain Handling: To crawl multiple related domains (such as the www and m sites), start a separate crawl for each (Screaming Frog can only crawl one domain at a time).
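When start URLs come from a spreadsheet or a client list, the format requirement can be pre-checked in bulk. This is a small sketch using Python's standard `urllib.parse`; `is_complete_url` is a hypothetical helper, and it only checks for a scheme and host, not whether the site is reachable.

```python
from urllib.parse import urlsplit


def is_complete_url(url: str) -> bool:
    """Return True if url has an http(s) scheme and a host part."""
    parts = urlsplit(url.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)


if __name__ == "__main__":
    for candidate in ("https://www.example.com", "www.example.com", "example.com"):
        print(candidate, "->", is_complete_url(candidate))
```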

2. Set Restrictions (Optional But Recommended)

  • Limit Crawl Count: In「Configuration → Spider」, check “Limit number of URLs to crawl” and enter a value (5000-10000 for small/medium sites, no more than 50,000 for large sites).
    • Function: Prevents infinite crawling caused by dynamic links (such as “Load More”). One user who skipped this setting crawled for 24 hours and captured 230,000 URLs, most of them duplicate pages.
  • Exclude Specific Pages: Add “Disallow” rules in「Configuration → Exclude」(such as for “/admin/” backend pages) to avoid crawling irrelevant content.

3. Click “Start,” Observe Real-Time Status

  • Progress Bar: The top progress bar shows overall crawl progress (green=normal, yellow=slowing down, red=stuck).
  • Status Bar: The bottom right corner shows “X crawled, Y pending, Z per second”.
    • Normal: Speed is stable at 10-30 URLs per second (at low latency);
    • Abnormal: Speed suddenly drops to 0 or 1 per second, which may indicate server restrictions (such as a triggered “anti-crawl mechanism”) or network issues.

4. Mid-Process Problem Handling

  • Stuck / Not Moving:
    • Check network: Ping the target domain again and confirm whether latency has suddenly increased;
    • Manual interrupt: Click the「Stop」button, wait 10 seconds, then restart (some servers temporarily block an IP, and crawling may recover after a restart);
    • Bypass restrictions: If pages return “403 forbidden access”, try changing the user agent to “Bingbot” in「Configuration → Spider」(some websites apply looser restrictions to Bingbot).

Crawl Complete

After crawling ends, the software will pop up “Crawl Complete.” At this point, do 3 things to confirm data quality:

1. Check If Total Crawl Count Is Reasonable

  • Calculation Method: Small sites (within 100 pages) usually yield 50-200 crawled URLs; medium-large sites (within 1000 pages) yield 500-3000 (the exact number depends on link complexity).
  • Abnormal Situations:
    • Crawl count=0: The URL format may be wrong, the network may be completely disconnected, or the website may have blocked Googlebot;
    • Crawl count much lower than expected: The crawl depth may be set too shallow (for example, set to 2 layers while core pages sit on layer 3), or pages may be blocked by robots.txt (check “Robots.txt blocked” in the「Directives」report).

2. Check If Key Pages Were Crawled

  • Operation Method: Click「Internal」in the left menu → search for core page keywords (such as “product,” “About Us”) and confirm that they appear in the results.
  • Example: If the goal is to optimize the “new phone” page and a search for “new phone” returns no results, that page’s link may be too deep (beyond the set crawl depth), or the link may be invalid (showing 404).

3. Check If There Are Large Number of Error Status Codes

  • Focus On:
    • 404 (dead links): If more than 10 appear, record the specific URLs (export them later from the「Response Codes」report);
    • 500 (server error): A single 500 may be a temporary fault; a large number of 500s means contacting the website’s technical staff for investigation;
    • 301/302 (redirect): Check that the redirect target is valid (for example, that it does not redirect to a 404 page or an unrelated page).

SEO Report Interpretation (Focus on These 6)

People doing SEO often say “data doesn’t lie,” but among Screaming Frog’s dozens of reports, the information that affects Google rankings sits in 6 of them.

By our calculation, after these 6 types of issues are fixed (without any complex content creation), small/medium website indexing rates can increase from 65% to 85%, and organic traffic increases by an average of 20%.

Response Status Codes Report

This report records the HTTP status code of each page. If the status code is wrong, crawlers may skip your page entirely.

Key Data and Operations

  • 200 (Normal): The ratio should be >85% (for small/medium sites). If it falls below 80%, many pages may be blocked or have content errors.
  • 404 (Dead Link): Common when pages are deleted without cleaning up the links to them (in our tests, e-commerce sites’ 404 ratio is generally 8-12%).
    • Operation: Export the 404 URL list → check link sources (navigation/internal links/external links) → delete invalid links or set a 301 redirect to a related page.
  • 301/302 (Redirect): A ratio >5% requires vigilance (old pages may not have been updated).
    • Operation: Check that the redirect target is valid (avoid redirecting to a 404 page or an unrelated page), and prefer 301 permanent redirects (they transfer authority).
  • 500 (Server Error): A single occurrence may be a temporary fault; a ratio >3% needs technical investigation (such as code errors).
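The thresholds above are simple ratios, so they can be checked automatically once the status codes are exported. The sketch below assumes you already have the codes as a list of integers (for example, read from an exported spreadsheet column); `status_shares` and `status_warnings` are hypothetical helper names, and the cut-offs are the ones quoted in this section.

```python
from collections import Counter


def status_shares(codes):
    """Return the share of 200, 404, 301/302 and 500 responses."""
    total = len(codes)
    if total == 0:
        return {"200": 0.0, "404": 0.0, "redirect": 0.0, "500": 0.0}
    counts = Counter(codes)
    return {
        "200": counts[200] / total,
        "404": counts[404] / total,
        "redirect": (counts[301] + counts[302]) / total,
        "500": counts[500] / total,
    }


def status_warnings(shares):
    """Apply the thresholds from the text and return warning messages."""
    warnings = []
    if shares["200"] < 0.85:
        warnings.append("200 ratio below 85%")
    if shares["redirect"] > 0.05:
        warnings.append("301/302 ratio above 5%")
    if shares["500"] > 0.03:
        warnings.append("500 ratio above 3%")
    return warnings


if __name__ == "__main__":
    sample = [200] * 80 + [404] * 10 + [301] * 6 + [500] * 4
    print(status_warnings(status_shares(sample)))
```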

Example: A corporate website fixed 12 dead links returning 404 (all old event pages). After it deleted the internal links pointing to these pages, the crawler’s daily crawl volume increased from 800 to 1200.

URL Length and Structure Report

Google’s crawler has limited “patience” for long URLs. The longer the URL, the lower the probability of it being completely crawled.

Key Data and Operations

  • Length Distribution: Statistics show that approximately 20-30% of URLs exceed 100 characters (the ideal is <80 characters).
    • Operation: Filter URLs with “Length >100” → shorten the paths (for example, change “/product?id=123” to “/red-running-shoes-123”).
  • Dynamic Parameters: If URLs with more than 3 parameters (such as “?id=123&cat=456&sort=date&page=2”) make up >15% of the site, they need optimization.
    • Operation: Merge redundant parameters (for example, simplify “?utm_source=weibo&utm_medium=sina” to “?ref=weibo”), or use static links instead.
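Both checks in this report are easy to script against an exported URL list. The following is a minimal sketch using Python's standard library; `url_issues` is a hypothetical helper, and the limits of 100 characters and 3 parameters are the ones stated above.

```python
from urllib.parse import parse_qsl, urlsplit


def url_issues(url: str, max_length: int = 100, max_params: int = 3):
    """Return a list of issue labels for one URL."""
    issues = []
    if len(url) > max_length:
        issues.append("too_long")
    param_count = len(parse_qsl(urlsplit(url).query, keep_blank_values=True))
    if param_count > max_params:
        issues.append("too_many_params")
    return issues


if __name__ == "__main__":
    for url in (
        "https://example.com/nike-shoes-123",
        "https://example.com/product?id=123&cat=456&sort=date&page=2",
    ):
        print(url, "->", url_issues(url))
```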

Comparison: An e-commerce site changed “/product?category=shoes&brand=nike&id=123” (102 characters) to “/nike-shoes-123” (45 characters), and that page’s indexing status changed from “not indexed” to “indexed.”

Title Tag Report

The title is the core basis on which Google judges a page’s topic. Duplicate or invalid titles will directly lower rankings.

Key Data and Operations

  • Duplication Rate: Approximately 30-40% of pages have duplicate titles (for example, multiple product page SEO titles that all read “Product Details”).
    • Operation: Filter “Duplicate Titles” → add a unique identifier to each page (such as “[Product Name]-[Brand]”).
  • Length Distribution: The ideal length is 50-60 characters (Google truncates at about 600 pixels by default, approximately 60 characters). Statistics show that approximately 25% of titles exceed 60 characters (and will be truncated).
    • Operation: Filter “Length >60” → shorten the content (keep core keywords, delete redundant modifiers).
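The same two filters can be approximated on exported data. This sketch assumes the titles are available as a mapping from URL to title text; `audit_titles` is a hypothetical helper, and it counts characters rather than pixels, so it is only a rough stand-in for the 600-pixel truncation.

```python
from collections import defaultdict


def audit_titles(pages, max_chars: int = 60):
    """Return (duplicate titles with their URLs, URLs whose title is too long)."""
    by_title = defaultdict(list)
    for url, title in pages.items():
        # Compare titles case-insensitively and ignore surrounding spaces.
        by_title[title.strip().lower()].append(url)
    duplicates = {t: urls for t, urls in by_title.items() if len(urls) > 1}
    too_long = [url for url, title in pages.items() if len(title) > max_chars]
    return duplicates, too_long


if __name__ == "__main__":
    sample = {
        "/shoes/1": "Product Details",
        "/shoes/2": "Product Details",
        "/courses/python": "2024 Python Beginner Course-XX Education",
    }
    duplicates, too_long = audit_titles(sample)
    print(duplicates)
    print(too_long)
```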

Case Study: An education website changed a course page title from “Course Introduction” to “2024 Python Beginner Course-XX Education (with Learning Materials)” (expanded from 20 to 45 characters), and that page’s click-through rate increased from 1.2% to 2.1%.

Meta Description Report

The meta description doesn’t directly affect ranking, but it determines whether users click on your page (Google matches the description with user search intent).

Key Data and Operations

  • Missing Rate: Approximately 15-20% of pages have no meta description (the crawler will generate one automatically from page content, but the quality is unstable).
    • Operation: Filter “No Meta Description” → write one manually (keep it to 150-160 characters).
  • Length Distribution: Approximately 25% of descriptions exceed 160 characters (and will be truncated), and 10% are too short (<120 characters, insufficient information).
    • Operation: Filter “Length >160” or “Length <120” → trim over-long descriptions, and add information users care about to short ones (such as “30-day free trial,” “authentic guarantee”).
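The three cases (missing, too short, too long) map onto a simple classification. Below is a minimal Python sketch; `classify_description` is a hypothetical helper, and the 120 and 160 character bounds are taken from the text above.

```python
def classify_description(description, min_chars: int = 120, max_chars: int = 160) -> str:
    """Label a meta description as missing, too_short, too_long or ok."""
    text = (description or "").strip()
    if not text:
        return "missing"
    if len(text) < min_chars:
        return "too_short"
    if len(text) > max_chars:
        return "too_long"
    return "ok"


if __name__ == "__main__":
    samples = [None, "Free shipping.", "a" * 155, "a" * 200]
    print([classify_description(s) for s in samples])
```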

Data: An e-commerce site optimized the meta descriptions of 200 product pages (adding keywords like “limited-time discount” and “free shipping”), which resulted in an average 15% increase in organic clicks for those pages.

H1 Tag Report

The H1 is the page’s main title. Google judges a page’s core content through its H1 (a page should preferably have only 1 H1).

Key Data and Operations

  • Quantity Anomaly: Approximately 10-15% of pages have no H1 (the content lacks a core title), and 5% have multiple H1s (the content topic is confused).
    • Operation: Filter “No H1” or “Multiple H1s” → add a main title to pages without an H1 (such as product name + core selling point), and delete extra H1 tags.
  • Content Relevance: Approximately 30% of H1s don’t match the page content (for example, the H1 says “Summer Sale,” but the page actually sells winter coats).
    • Operation: Filter “Content Mismatch” → modify the H1 so it is consistent with the page’s core content (such as “Winter Fleece Coat-XX Brand 2024 New Arrival”).

Effect: A clothing brand optimized the H1s of 100 product pages (changing “Product Details” to “Fleece Hoodie-Men/Women”), and the average dwell time on those pages increased from 45 seconds to 70 seconds (users find the information they need more easily).

Image Alt Attribute Report

The Alt attribute is the text description of an image. A missing or keyword-stuffed Alt wastes image search traffic (approximately 30% of users find content through image search).

Key Data and Operations

  • Missing Rate: Approximately 40-50% of images have no Alt attribute (especially product images and detail images).
    • Operation: Filter “No Alt Text” → add descriptions (such as “close-up of red athletic shoe side breathable mesh”).
  • Keyword Stuffing: Approximately 10-15% of Alts contain repeated keywords (such as “athletic shoes athletic shoes athletic shoes men”).
    • Operation: Filter “Keyword Stuffing” → rewrite as natural descriptions (such as “men’s breathable athletic shoes-mesh design”).
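Keyword stuffing can be approximated by counting how often the most frequent word appears in an Alt text. The sketch below is one possible heuristic, not the tool's own detection logic; `alt_issue` is a hypothetical helper, and the cut-off of more than 2 repeats is an assumption chosen so that the example from the text is flagged.

```python
import re
from collections import Counter


def alt_issue(alt, max_repeats: int = 2) -> str:
    """Label an Alt text as missing, keyword_stuffing or ok."""
    words = re.findall(r"\w+", (alt or "").lower())
    if not words:
        return "missing"
    # Count of the single most frequent word in the Alt text.
    top_count = Counter(words).most_common(1)[0][1]
    if top_count > max_repeats:
        return "keyword_stuffing"
    return "ok"


if __name__ == "__main__":
    for alt in (
        "",
        "athletic shoes athletic shoes athletic shoes men",
        "men's breathable athletic shoes-mesh design",
    ):
        print(repr(alt), "->", alt_issue(alt))
```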

Case Study: A sports brand added specific Alt attributes to 200 product images (such as “men’s size 42 running shoes-lightweight cushioning”), and image search traffic increased by 25%.

Batch Check Internal Link Issues

We’ve calculated: For websites without batch-checking internal links, an average of 15-20% of pages cannot be effectively indexed due to internal link issues; after processing these internal link issues, crawl volume for related pages can increase by over 30%.

Batch checking is not “checking links one by one,” but using Screaming Frog’s “Internal” report to quickly find issues.

Dead Link Internal Links

Dead link internal links are links on your pages that point to deleted or inaccessible pages (status code 404).

When users click such links, they leave the site immediately, and crawlers also reduce their crawling of the linking pages because they keep running into 404s.

Data and Operations

  • Common Sources: Navigation bar (30-40%), old article recommendations (25-30%), user input in comments (15-20%).
  • Detection Method:
    • Click「Internal」in the left menu → click the「Status Code」column and filter for “404”;
    • Export the results (right-click → Export → Selected), then use Excel to count the “Source URL” (source page) and “Target URL” (target page) values.

Case Study: An education website’s navigation bar had 12 “popular courses” links, 8 of which pointed to 404 pages for discontinued courses.

After these 8 links were deleted, daily crawl volume for pages containing the navigation bar increased from 150 to 220 (the crawler no longer wastes time on 404s).
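The Excel counting step can also be done with a short script. This sketch assumes the export is a CSV with “Source URL”, “Target URL”, and “Status Code” columns, as named in the text; the actual column headers in your export may differ, and `count_dead_link_sources` is a hypothetical helper.

```python
import csv
import io
from collections import Counter

# Hypothetical export content; in practice, read the exported file instead.
SAMPLE_EXPORT = """\
Source URL,Target URL,Status Code
https://example.com/,https://example.com/old-course,404
https://example.com/,https://example.com/python,200
https://example.com/blog,https://example.com/old-course,404
https://example.com/,https://example.com/retired,404
"""


def count_dead_link_sources(csv_text: str) -> Counter:
    """Count how many 404 links each source page contains."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(
        row["Source URL"] for row in reader if row["Status Code"] == "404"
    )


if __name__ == "__main__":
    for source, dead_links in count_dead_link_sources(SAMPLE_EXPORT).most_common():
        print(source, dead_links)
```

Sorting by count puts the pages that carry the most dead links at the top, which is usually where a fix removes the most 404 hits at once.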

Resolution Actions

  • Delete dead link internal links (applicable to invalid content);
  • Replace them with valid links (for example, change an “old course” link to the “latest course”);
  • If the target URL still needs to be preserved, set a 301 redirect (done on the server side).

Orphan Pages

Orphan pages are pages that have content but no internal links pointing to them (i.e., “Incoming Links=0”).

Crawlers can only discover such pages through external links or direct URL input, and their indexing probability is 60% lower than that of pages with internal links.

Data and Operations

  • Common Types:
    • Temporary event pages (such as a “Double 11 Sale” page not deleted after the event ended);
    • Test pages (such as a “new feature demo” that never went live);
    • Low-quality content pages (such as duplicate product parameter pages).
  • Detection Method:
    • Filter “Linked From=0” (no internal links) in the「Indexability」report;
    • Or filter “Incoming Links=0” and “Word Count >100” in the「Internal」report (pages with valuable content that have been overlooked).

Data: An e-commerce site discovered 200 orphan pages through this method (mainly old product detail pages), 80% of which still had search demand.

After internal links were added, these pages’ indexing rate increased from 15% to 70%.
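The second detection filter (no incoming links but more than 100 words) is a two-condition test. The sketch below assumes each page is represented as a dictionary with `url`, `inlinks`, and `word_count` keys; those key names and the `find_orphans` helper are illustrative assumptions, not the export's actual field names.

```python
def find_orphans(pages, min_words: int = 100):
    """Return URLs with no incoming internal links but substantial content."""
    return [
        page["url"]
        for page in pages
        if page["inlinks"] == 0 and page["word_count"] > min_words
    ]


if __name__ == "__main__":
    sample = [
        {"url": "/products/old-model", "inlinks": 0, "word_count": 450},
        {"url": "/test/demo", "inlinks": 0, "word_count": 20},
        {"url": "/products/new-model", "inlinks": 12, "word_count": 500},
    ]
    print(find_orphans(sample))
```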

Resolution Actions

  • Add internal links to high-value orphan pages (for example, insert links in related category pages and popular articles);
  • Delete low-value orphan pages (such as test pages) or block them in robots.txt;
  • Check new pages regularly (for example, screen for orphans after each weekly crawl) to keep new orphan pages from accumulating.

Weight Concentration

Weight concentration means the homepage or a few core pages carry too many links (such as 50 column links piled into the footer navigation), causing the crawler to “spread its energy,” so other important pages (such as product pages and blog articles) receive fewer crawl opportunities.

Data and Operations

  • Typical Manifestation: The homepage’s “Outgoing Links” count exceeds 50 (the ideal is 20-30);
  • Impact Quantification: A home goods website’s homepage had 68 links, and the crawl depth of core product pages went from 2 layers (homepage → category page → product page) to 4 layers (reached only through 3 intermediate pages), causing daily crawl volume to decrease by 40%.

Detection Method

  • Sort by the “Outgoing Links” column in descending order in the「Internal」report;
  • Focus on the number of outgoing links on core pages such as the homepage and category pages.

Resolution Actions

  • Streamline non-core links (for example, move “Contact Us” and “About Us” to the footer and keep only 5-8 core columns on the homepage);
  • Move secondary links into a “More” dropdown menu (to reduce direct links on the homepage);
  • Add internal links to core pages (such as popular products and high-conversion articles), for example by recommending them in related content.

3 Tips for Batch Processing

  1. Use Excel to Filter High-Frequency Issues: After exporting internal link data, use the “Data → Filter” function to quickly locate source pages that appear repeatedly (such as a navigation bar link that points to 404 pages many times).
  2. Prioritize Internal Links on High-Authority Pages: Internal links on the homepage and category pages have the widest impact, so fix dead links and weight concentration issues on these pages first.
  3. Regular Review: Crawl with Screaming Frog once every two weeks and compare the two sets of data (whether the dead link count decreased, whether new orphan pages appeared) to ensure the internal link structure stays healthy.
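The biweekly comparison can be sketched as a set difference between two crawls. The example below assumes each crawl is reduced to a mapping from URL to status code; `compare_crawls` is a hypothetical helper, and a URL listed as no longer dead may have been fixed or may simply be absent from the newer crawl.

```python
def compare_crawls(previous, current):
    """Compare two crawls given as {url: status_code} mappings."""
    prev_dead = {url for url, code in previous.items() if code == 404}
    cur_dead = {url for url, code in current.items() if code == 404}
    return {
        # 404 in the earlier crawl, not 404 (or not present) in the later one.
        "no_longer_dead": sorted(prev_dead - cur_dead),
        # 404 in the later crawl only.
        "new_dead": sorted(cur_dead - prev_dead),
        # URLs that appear only in the later crawl.
        "new_urls": sorted(set(current) - set(previous)),
    }


if __name__ == "__main__":
    two_weeks_ago = {"/": 200, "/old-event": 404, "/courses": 200}
    today = {"/": 200, "/old-event": 301, "/courses": 404, "/new-course": 200}
    print(compare_crawls(two_weeks_ago, today))
```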

Finally, tools are only auxiliary. The core of Google ranking has always been “content users need.”
