The ‘Content’ tab also includes a ‘Low Content Pages’ filter, which identifies pages with less than 200 words using the improved word count. The near-duplicate content threshold and content area used in the analysis can both be updated post-crawl, and crawl analysis can be re-run to refine the results, without the need for re-crawling. This displays every near-duplicate URL identified, and their similarity match.Ĭlicking on a ‘Near Duplicate Address’ in the ‘Duplicate Details’ tab will display the near duplicate content discovered between the pages, and perform a diff to highlight the differences. Near duplicates requires post crawl analysis to be populated, and more detail on the duplicates can be seen in the new ‘Duplicate Details’ lower tab. It can also be used to provide a more accurate word count. This can help focus the analysis on the main content area, avoiding known boilerplate text. Semantic elements such as the nav and footer are automatically excluded from the content analysis, but you can refine it further by excluding or including HTML elements, classes and IDs. This can be configured via ‘Config > Content > Duplicates’. The new ‘Near Duplicates’ detection uses a minhash algorithm, which allows you to configure a near-duplicate similarity threshold, which is set at 90% by default. The ‘Exact Duplicates’ filter uses the same algorithmic check for identifying identical pages that was previously named ‘Duplicate’ under the ‘URL’ tab. Very similar pages should be minimised and high similarity could be a sign of low-quality pages, which haven’t received much love – or just shouldn’t be separate pages in the first place.įor ‘Near Duplicates’, the SEO Spider will show you the closest similarity match %, as well as the number of near-duplicates for each URL. While there isn’t a duplicate content penalty, having similar pages can cause cannibalisation issues and crawling and indexing inefficiencies. We’ve introduced a new ‘ Content‘ tab, which includes filters for both ‘Near Duplicates’ and ‘Exact Duplicates’. You can now discover near-duplicate pages, not just exact duplicates. We’ve been busy developing exciting new features, and despite the obvious change in priorities for everyone right now, we want to continue to release updates as normal that help users in the work they do. We are excited to announce the release of Screaming Frog SEO Spider version 13.0, codenamed internally as ‘Lockdown’.
0 Comments
Leave a Reply. |
AuthorWrite something about yourself. No need to be fancy, just an overview. ArchivesCategories |