Brand360

Web Fingerprint

26 technology, content and infrastructure detections

F1

CMS / Content Management System

What is it

CMS platform detection based on meta tags, HTML comments, URL structure, cookies, and platform-specific scripts. Recognizes CMS platforms and site builders (WordPress, Joomla, Drupal, Ghost, Wix, Squarespace, Webflow, Typo3, HubSpot, Stranka.sk, Webnode, Blogger) as well as web frameworks (Nette, Laravel, Django), and more.
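
The meta-tag signal mentioned above can be sketched as follows. This is only an illustration of one signal, not Brand360's detector: the helper name `detect_cms_from_generator` and the `KNOWN_GENERATORS` table are assumptions, and a real detector would combine it with the other signals (comments, cookies, URL structure, scripts).

```python
from html.parser import HTMLParser


class GeneratorMetaParser(HTMLParser):
    """Collects the content of every <meta name="generator"> tag."""

    def __init__(self):
        super().__init__()
        self.generators = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").lower() == "generator":
            self.generators.append(attr.get("content", ""))


# Hypothetical signature table: substring of the generator value -> CMS name.
KNOWN_GENERATORS = {
    "wordpress": "WordPress",
    "joomla": "Joomla",
    "drupal": "Drupal",
    "ghost": "Ghost",
    "typo3": "Typo3",
}


def detect_cms_from_generator(html):
    """Return the CMS suggested by the generator meta tag, or None."""
    parser = GeneratorMetaParser()
    parser.feed(html)
    for value in parser.generators:
        lowered = value.lower()
        for needle, name in KNOWN_GENERATORS.items():
            if needle in lowered:
                return name
    return None
```

A page that contains `<meta name="generator" content="WordPress 6.4.2">` would be reported as WordPress; a page without the tag returns `None`, so other signals would have to decide.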

Why it matters

The CMS is the foundation of web infrastructure — it determines security risks, performance, SEO capabilities, and maintenance costs. WordPress has different vulnerabilities than Webflow, and an e-shop on Shoptet requires different optimization than WooCommerce.

Real-world example

A site running WordPress 6.x has access to thousands of plugins but requires regular core, theme, and plugin updates. A hosted Webflow site needs no server or plugin maintenance but is less flexible. Fingerprint can often detect the CMS even when the operator has removed visible markers such as the generator meta tag.

F2

E-commerce Platform

What is it

Identification of the e-commerce solution — Shoptet, PrestaShop, WooCommerce, Magento, OpenCart, Webareal, Shopify, Shoper, Upgates. Detection is performed via specific URL patterns, cart scripts, payment integrations, and meta tags.

Why it matters

The e-commerce platform directly affects conversion rate, product page loading speed, product SEO, and marketplace integrations. Each platform has specific limitations and optimization options.

Real-world example

Shoptet has native integration with Heureka.sk and Zbozi.cz, while WooCommerce requires plugins. Magento can handle millions of products but is more demanding on hosting. Fingerprint identifies both the platform and its version.

F3

JS / CSS Frameworks + CDN

What is it

Detection of JavaScript frameworks (jQuery, React, Vue.js, Angular, Alpine.js, HTMX, Turbo, Stimulus, Svelte) with versions, CSS frameworks (Bootstrap, Tailwind, Bulma, Foundation), and CDN providers (Cloudflare, CloudFront, Akamai, Fastly, jsDelivr).

Why it matters

The tech stack determines how modern, performant, and maintainable a website is. A component-based stack such as React 18 with Next.js is typically easier to maintain and optimize than unstructured jQuery code. The CDN provider affects latency and availability for end users.

Real-world example

A site using React 18 + Next.js + Tailwind CSS behind the Cloudflare CDN runs on a current stack. A site with jQuery 1.x + Bootstrap 3 and no CDN relies on unmaintained library versions and is likely slower for distant visitors. Fingerprint also reveals versions, which helps identify security risks.

F4

Analytics & Marketing

What is it

Detection of analytics and marketing tools — Google Analytics (GA4, UA), GTM, Facebook Pixel, Hotjar, Heureka, Sklik, Criteo, Google Ads, SmartSupp, Biano, Luigi's Box, CookieYes, and more.

Why it matters

Analytics tools indicate the level of a company's digital maturity. A site without any analytics tool has no visitor data. The presence of remarketing pixels indicates active online marketing. A consent tool such as CookieYes suggests the site is addressing GDPR cookie consent.

Real-world example

An e-shop with GA4 + GTM + Facebook Pixel + Heureka tracking has sophisticated analytics. A blog without any analytics has no visibility into traffic. Fingerprint also detects duplicate or conflicting tracking codes.

F5

Payment Gateways

What is it

Identification of payment gateways and methods — GoPay, Stripe, PayPal, Comgate, Tatrapay, Sporopay, CardPay, Cash on Delivery, Bank Transfer. Detection via JavaScript SDK, checkout URL patterns, and form elements.

Why it matters

Payment methods directly affect e-shop conversion rates. Customers expect card payments, bank transfers, and cash on delivery. Missing payment methods result in lost orders.

Real-world example

An e-shop with GoPay (card + bank transfer) plus cash on delivery covers the payment methods most Slovak customers expect. A site with only PayPal loses customers who don't have a PayPal account. Stripe is a common choice for international payments.

F6

Fonts

What is it

Detection of fonts in use — Google Fonts (with family extraction), Adobe Fonts (Typekit), Font Awesome, Custom WOFF/WOFF2. Analysis of the number of font families and their impact on performance.

Why it matters

Fonts are often among the heaviest resources that delay text rendering. Each font family typically adds 50-200 KB to the download. Too many fonts slow down LCP (Largest Contentful Paint) and degrade Core Web Vitals.

Real-world example

A site with 1-2 Google Fonts families usually loads well. A site with 6+ different fonts and Font Awesome icons can have a first render that is 500 ms or more slower. Self-hosted WOFF2 fonts are often faster than the Google Fonts CDN because they avoid an extra connection to a third-party host.

F7

CDN Provider

What is it

CDN provider identification — Cloudflare, Fastly, Akamai, CloudFront, Google CDN. Detection via HTTP headers (cf-ray, x-cache, x-amz-cf-id), DNS records, and certificates.
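
The header-based part of this detection could look roughly like the sketch below. The header names come from the paragraph above; the `CDN_HEADER_HINTS` mapping and the `guess_cdn` function are illustrative assumptions, and DNS or certificate checks are left out.

```python
# Headers that point to a single provider. x-cache is deliberately omitted
# because several CDNs and caches emit it, so it is ambiguous on its own.
CDN_HEADER_HINTS = {
    "cf-ray": "Cloudflare",
    "x-amz-cf-id": "CloudFront",
}


def guess_cdn(headers):
    """Guess the CDN from a dict of HTTP response headers, or return None."""
    # Header names are case-insensitive, so normalize them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    for header, provider in CDN_HEADER_HINTS.items():
        if header in lowered:
            return provider
    # Cloudflare also identifies itself in the Server header.
    if lowered.get("server", "").lower() == "cloudflare":
        return "Cloudflare"
    return None
```

Returning `None` here only means that the headers were inconclusive, not that the site has no CDN.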

Why it matters

A CDN dramatically reduces latency for end users. A site without a CDN serves content from a single server, resulting in higher latency for distant visitors. Cloudflare also provides DDoS protection and WAF.

Real-world example

A site behind Cloudflare can serve cached content with a TTFB under 100 ms even for visitors from other continents. A site on shared hosting without a CDN can have a TTFB of 500 ms or more for international visitors. A CDN also reduces load on the origin server.

F8

Hosting / Server Info

What is it

Web server and reverse proxy detection: Nginx, Apache, LiteSpeed, IIS, and Tomcat, including version and release year. Reverse proxies: Varnish, BigIP, HAProxy, Envoy, Traefik. Identification via the Server header and proxy-specific headers.
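
As a rough sketch of the Server-header part, the product name and version can be split with a small regular expression. The function name `parse_server_header` is hypothetical, and many servers hide or shorten this header, so a missing version is a normal result.

```python
import re

# First token of the header: product name, optionally followed by "/version".
SERVER_TOKEN = re.compile(r"^(?P<name>[^/\s]+)(?:/(?P<version>\d+(?:\.\d+)*))?")


def parse_server_header(value):
    """Split a Server header value into (name, version); either may be None."""
    match = SERVER_TOKEN.match(value.strip())
    if not match:
        return None, None
    return match.group("name"), match.group("version")
```

For example, `Apache/2.4.57 (Debian)` yields the name `Apache` and the version `2.4.57`, while a bare `cloudflare` yields a name with no version.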

Why it matters

Server software and its version affect performance and security. An outdated Apache version may contain known vulnerabilities. LiteSpeed is often faster than Apache for PHP sites. A dedicated reverse proxy often indicates enterprise-grade infrastructure.

Real-world example

A site on Nginx 1.25 + Varnish cache has enterprise-grade infrastructure. A site on Apache 2.2 (EOL since 2018) is a security risk. Fingerprint reveals exact versions, which helps with security audits.

F9

Website Type Classification

What is it

Heuristic website classification based on detections: e-shop, marketplace, blog, forum, social network, aggregator, news portal, wiki, portfolio, catalog, booking, SaaS, streaming. Uses a combination of the detected CMS, e-commerce platform, and content signals.

Why it matters

The website type determines relevant metrics and benchmarks. An e-shop is evaluated differently than a blog — conversion rate vs. time on page. Classification enables comparison with relevant competitors in the same category.

Real-world example

A site with WooCommerce + product pages + a cart is classified as an e-shop. A site with WordPress + articles without products is a blog. A SaaS site has a login page, pricing, and documentation.

F10

Social Networks

What is it

Detection of social network links — Facebook, Instagram, Twitter/X, LinkedIn, YouTube, TikTok, Pinterest. URL extraction from footer links, meta tags (og:see_also), and JSON-LD.
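
A simplified sketch of the link-based part is shown below. It only scans `href` attributes with a regular expression and keeps the first URL per network; the `SOCIAL_HOSTS` table and the `extract_social_links` name are assumptions, and the meta-tag and JSON-LD sources are not covered.

```python
import re
from urllib.parse import urlparse

# Assumed host -> network mapping for the networks listed above.
SOCIAL_HOSTS = {
    "facebook.com": "Facebook",
    "instagram.com": "Instagram",
    "twitter.com": "Twitter/X",
    "x.com": "Twitter/X",
    "linkedin.com": "LinkedIn",
    "youtube.com": "YouTube",
    "tiktok.com": "TikTok",
    "pinterest.com": "Pinterest",
}

HREF_RE = re.compile(r'href=["\'](https?://[^"\']+)["\']', re.IGNORECASE)


def extract_social_links(html):
    """Return {network: first matching URL} for links found in the HTML."""
    found = {}
    for url in HREF_RE.findall(html):
        host = (urlparse(url).hostname or "").lower()
        for domain, network in SOCIAL_HOSTS.items():
            # Match the domain itself or any subdomain (www., m., sk., ...).
            if host == domain or host.endswith("." + domain):
                found.setdefault(network, url)
                break
    return found
```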

Why it matters

Social media presence indicates a company's digital maturity and marketing strategy. A LinkedIn profile suggests B2B focus, TikTok suggests a younger target audience. Absence of social media may signal an inactive business.

Real-world example

A company with Facebook + Instagram + LinkedIn + YouTube has a comprehensive social presence. An e-shop with only a Facebook page is using a minimum of channels. Fingerprint extracts the exact URL for each platform.

C1

Visible Text Extraction

What is it

Removal of HTML tags, scripts, styles, and invisible elements — a clean text representation of the page. Used as input for keyword extraction, embeddings, and AI analysis.
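
A minimal sketch of such an extraction, using only the Python standard library, might look like this. It drops tags and the contents of script, style, noscript, and template elements; it does not detect elements hidden by CSS or strip navigation, which a production extractor would also need. The names `VisibleTextParser` and `extract_visible_text` are illustrative.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collects text nodes, skipping elements that never render as text."""

    SKIPPED = {"script", "style", "noscript", "template"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data)


def extract_visible_text(html):
    """Return the page text with tags removed and whitespace collapsed."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())
```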

Why it matters

Clean text is the foundation for all content analysis. AI models and search engines work with text, not HTML code. Quality extraction filters out navigational noise and preserves only content-relevant text.

Real-world example

From an e-shop HTML page, extraction removes the menu, footer, cookie banner and retains the product description, specifications, and reviews. This clean text is then used to generate embeddings and extract keywords.

C2

Word Count

What is it

A basic content length metric for the analyzed page. Counts words in the extracted visible text after removing HTML tags and scripts.

Why it matters

Content length correlates with information depth and SEO performance. Pages with fewer than 300 words are commonly treated as 'thin content' in SEO practice. AI models tend to prefer more comprehensive sources when generating responses.
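
The metric itself is simple enough to sketch in a few lines. The whitespace-based word split and the function names are assumptions; the 300-word limit is the one quoted in the text.

```python
THIN_CONTENT_THRESHOLD = 300  # word limit quoted in the text


def count_words(visible_text):
    """Count whitespace-separated words in already extracted visible text."""
    return len(visible_text.split())


def is_thin_content(visible_text, threshold=THIN_CONTENT_THRESHOLD):
    """True when the page has fewer words than the threshold."""
    return count_words(visible_text) < threshold
```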

Real-world example

A product page with 50 words doesn't have enough information for SEO or AI. An article with 1500+ words has a greater chance of ranking in Google and being cited in AI responses. The optimal length depends on the page type.

KW1

Keywords — Extraction

What is it

Automatic keyword extraction from URL paths, H1, title, meta description, breadcrumbs, category tree, and headings. Scoring: score = weight × log2(frequency + 1) × log2(product_count + 2).
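
The scoring formula can be written directly in code. The sketch below assumes that `weight` is a per-source weight (for example, a higher value for H1 than for body headings), which the text does not spell out; the function name is illustrative.

```python
import math


def keyword_score(weight, frequency, product_count):
    """weight x log2(frequency + 1) x log2(product_count + 2)."""
    return weight * math.log2(frequency + 1) * math.log2(product_count + 2)
```

The `+ 1` makes a keyword that never occurs score zero, and the `+ 2` keeps the product factor at 1 or above, so a keyword with no associated products is not zeroed out. Both logarithms dampen the effect of very frequent terms.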

Why it matters

Keywords define the thematic focus of a website and are the foundation for both SEO and AI visibility. Automatic extraction reveals what topics a site actually focuses on — often different from what the owner believes.

Real-world example

An electronics e-shop should have 'mobile phone', 'notebook', and 'tablet' as its strongest keywords. If 'sale' comes out of the extraction as the strongest word instead, the site communicates discounts rather than products.

KW2

Keywords — Categorization

What is it

Classification of extracted keywords into categories — product, service, location, brand. Helps understand the thematic structure of a website and identify content gaps.

Why it matters

Keyword categorization shows whether a site covers all important aspects. An e-shop should have strong product keywords, a local business should have local ones. Gaps in categories indicate missing content.

Real-world example

A restaurant in Bratislava has strong product keywords ('pizza', 'pasta') but is missing local ones ('Bratislava', 'Old Town'). This means weak local SEO visibility and a low chance of appearing in AI responses to local queries.

SM1

Sitemap Existence

What is it

Check whether the site has an accessible sitemap.xml or sitemap index at standard URLs (/sitemap.xml, /sitemap_index.xml). Verification of HTTP status and XML format validity.
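
The format check on an already downloaded sitemap could be sketched as below. Fetching the file and checking its HTTP status are left out, and the function name `inspect_sitemap` is an assumption. The namespace is the one defined by the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def inspect_sitemap(xml_text):
    """Return (kind, locs): kind is 'urlset', 'sitemapindex', or None."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return None, []  # not well-formed XML
    if root.tag == f"{{{SITEMAP_NS}}}urlset":
        kind = "urlset"
    elif root.tag == f"{{{SITEMAP_NS}}}sitemapindex":
        kind = "sitemapindex"
    else:
        return None, []  # XML, but not a sitemap
    # Collect every <loc> in the sitemap namespace (page or child sitemap URLs).
    locs = [
        el.text.strip()
        for el in root.iter(f"{{{SITEMAP_NS}}}loc")
        if el.text
    ]
    return kind, locs
```

For a `urlset`, the length of the returned list is the URL count; for a `sitemapindex`, each entry is a child sitemap that would have to be inspected in turn.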

Why it matters

A sitemap is a map of the website for search engines and AI crawlers. Without a sitemap, crawlers must discover pages through links, which is slower and less reliable. Both Google and AI bots use sitemaps for efficient indexing.

Real-world example

An e-shop with 10,000 products and no sitemap risks Google missing a substantial share of its product pages (30-50% in this example). A site with an up-to-date sitemap typically gets new pages discovered much faster, often within about 48 hours of publication.

SM2

URL Count in Sitemap

What is it

Counting URLs in the sitemap — the basis for tier recommendation (FREE=1, BASIC=20, PRO=50+ URLs). Analysis of URL distribution across subdomains and sections.
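
One possible reading of the tier rule is sketched below. The text gives only the three numbers, so the cut-offs used here are an assumption: a single URL maps to FREE, up to 20 URLs to BASIC, and anything larger to PRO. The function name is hypothetical.

```python
def recommend_tier(url_count):
    """Map a sitemap URL count to an audit tier (assumed cut-offs)."""
    if url_count <= 1:
        return "FREE"
    if url_count <= 20:
        return "BASIC"
    # Assumption: sites between 21 and 49 URLs are also sent to PRO.
    return "PRO"
```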

Why it matters

The URL count determines the website's scope and the recommended audit tier. A small site with 5 URLs only needs a basic audit, while a large e-shop with thousands of products needs the PRO tier for a complete analysis.

Real-world example

A personal blog with 10 articles falls into the BASIC tier. An e-shop with 500 product pages needs the PRO tier to analyze all URLs. The number of URLs in the sitemap vs. the actual page count reveals indexing issues.

SM3

Sitemap Validity

What is it

Verification of sitemap XML format, URL correctness, and accessibility of linked pages. Checks the lastmod, changefreq, and priority elements.

Why it matters

An invalid sitemap can cause crawlers to ignore it. Incorrect URLs, missing namespaces, or invalid dates lead to indexing errors. Up-to-date lastmod dates help crawlers re-crawl efficiently.

Real-world example

A sitemap with URLs pointing to 404 pages signals a neglected website. A sitemap without lastmod dates doesn't allow crawlers to distinguish new from old content. A valid sitemap with current dates speeds up indexing.

SSL1

SSL Certificate — Existence

What is it

Verification that the domain uses HTTPS with a valid SSL/TLS certificate. Checks HTTP to HTTPS redirect and certificate validity for the given domain.

Why it matters

HTTPS has been a Google ranking signal since 2014, and since 2018 Chrome has labeled all HTTP pages as 'Not Secure'; Firefox shows a comparable warning. SSL/TLS is essential for user trust and data protection in transit.

Real-world example

A site without SSL shows a 'Not Secure' warning in the browser, which deters visitors. An e-shop without HTTPS cannot safely accept card payments. All modern websites should have a valid SSL/TLS certificate.

SSL2

SSL Certificate — Issuer

What is it

Identification of the SSL certificate issuer — Let's Encrypt, DigiCert, Sectigo, GlobalSign, GeoTrust, and others. Certificate type: DV (Domain Validation), OV (Organization Validation), EV (Extended Validation).

Why it matters

The certificate type indicates the level of identity verification. DV (Let's Encrypt) only verifies domain ownership. OV and EV also verify the organization. For e-shops and financial services, an OV/EV certificate signals trustworthiness.

Real-world example

A bank with an EV certificate (DigiCert) has the highest level of verification. A blog with a Let's Encrypt DV certificate has basic validation. Both encrypt traffic equally well; EV adds stronger identity verification for sensitive transactions.

SSL3

SSL Certificate — Validity

What is it

Check of the SSL certificate expiration date and the number of days until expiry. Warning for certificates approaching expiration (less than 30 days).
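
The days-until-expiry calculation can be sketched from the certificate's `notAfter` string, in the format Python's `ssl` module returns from `getpeercert()`. Fetching the certificate is omitted, the function names are illustrative, and the 30-day warning limit is the one given above.

```python
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after, now=None):
    """Whole days until expiry; not_after looks like 'Jan 31 00:00:00 2030 GMT'."""
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    now = now or datetime.now(timezone.utc)
    return (expires - now).days


def expiry_status(days_left, warn_below=30):
    """Classify the remaining validity as 'expired', 'warning', or 'ok'."""
    if days_left < 0:
        return "expired"
    if days_left < warn_below:
        return "warning"
    return "ok"
```

Passing `now` explicitly keeps the function testable; in normal use it defaults to the current UTC time.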

Why it matters

An expired SSL certificate causes the browser to block access to the site with an error page. Automatic renewal (Let's Encrypt, Cloudflare) eliminates this risk. Manually managed certificates require monitoring.

Real-world example

A certificate with 340 days of validity is fine. A certificate with 5 days until expiration requires immediate renewal. Let's Encrypt certificates are valid for 90 days and are normally renewed automatically before they expire, while commercial certificates are typically renewed annually.

EMB1

Vector Embeddings

What is it

Generation of 1024-dimensional vector embeddings from extracted text using the BGE-M3 model via OpenRouter. Vectors are stored in a pgvector database for semantic search.

Why it matters

Vector embeddings enable semantic comparison of websites — not by keywords, but by content meaning. Two sites with different words but the same focus will have similar vectors.

Real-world example

An electronics e-shop and a tech blog about gadgets will have similar embeddings, even though they use different terminology. The cosine similarity between their vectors will be high (>0.8), signaling content relatedness.
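
For reference, cosine similarity between two embedding vectors can be computed as in this dependency-free sketch. Real embeddings would have 1024 dimensions; the function name is illustrative.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)
```

The measure depends only on direction, not length, so a long page and a short page about the same topic can still score close to 1.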

EMB2

Competitor Similarity

What is it

Cosine similarity search in the embeddings database — finding the most content-similar websites in the Brand360 database. Result: TOP N closest domains with similarity percentage.
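
The TOP N lookup can be illustrated in memory as follows. This is a stand-in for the database query, not the actual implementation: in pgvector the same ranking would normally be done in SQL with its cosine distance operator. The function names and the in-memory `domain_vectors` dictionary are assumptions.

```python
import heapq
import math


def _cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_similar(query_vector, domain_vectors, n=5):
    """Return the n closest domains as (domain, similarity in percent)."""
    scored = (
        (domain, round(_cosine(query_vector, vector) * 100, 1))
        for domain, vector in domain_vectors.items()
    )
    return heapq.nlargest(n, scored, key=lambda item: item[1])
```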

Why it matters

Automatically finding similar websites reveals competitors the owner may not have known about. It also helps benchmark the site against actual competition rather than subjective estimates.

Real-world example

A Slovak clothing e-shop gets a list of the 5 closest sites from the database — e.g., ZOOT.sk (92%), About You (88%), Answear.sk (85%). The owner thus discovers who they are actually competing with for customers online.

MOD1

Tech Stack Modernity Score

What is it

Technology modernity and best practices adoption rating. Output: tier (Legacy / Standard / Modern / Cutting-edge) + score 0-100. Composed of two equally weighted categories (50/50): Tech Stack (JS/CSS framework, build tool, HTTP version, CDN, web server) and Best Practices (image format, font loading, prefetch, lazy loading, CSP, security headers).
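
The 50/50 weighting can be sketched as below, assuming each category has already been scored on a 0-100 scale. The tier cut-offs in `TIERS` are hypothetical: the text names the four tiers but does not state where one ends and the next begins.

```python
def modernity_score(tech_stack, best_practices):
    """Combine two 0-100 category scores with equal (50/50) weights."""
    return round(0.5 * tech_stack + 0.5 * best_practices)


# Hypothetical cut-offs (minimum score, tier name), highest first.
TIERS = [
    (80, "Cutting-edge"),
    (60, "Modern"),
    (40, "Standard"),
    (0, "Legacy"),
]


def modernity_tier(score):
    """Return the first tier whose minimum the score reaches."""
    for minimum, name in TIERS:
        if score >= minimum:
            return name
    return "Legacy"
```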

Why it matters

Tech stack modernity directly correlates with performance, security, and maintenance costs. Legacy technologies (jQuery + Bootstrap 3 + Apache) carry higher security risks and worse performance than modern ones (Next.js + Tailwind + Nginx).

Real-world example

A site with Next.js 14 + Tailwind + HTTP/3 + Cloudflare + WebP images scores 85/100 (Cutting-edge). A site with jQuery 1.x + Bootstrap 3 + Apache 2.2 + JPEG images scores 25/100 (Legacy). The score helps prioritize technical modernization.
