Building an AI-Assisted Positioning Audit Tool: Design, Problems & Insights

This post is a follow-up to an earlier piece on [what the Odyssey positioning audit measures and what it found across 16 Irish construction companies]. If you are an SME owner thinking about your positioning, you can start there.

This post is more about how I built the tool: the thinking process and the challenges I encountered along the way. If you build tools, work with AI in a professional services context, or simply want to understand what I did, this is the account you are looking for.

The short version: the tool runs across seven scoring dimensions, uses a four-point scale, draws from two data sources, makes seven parallel AI calls per audit, and puts a consultant in the loop before anything reaches a client. What follows explains why each of those choices was made and what building it actually involved.

The Brief for the Tool

Before taking on a client engagement, I need an outside view of how well a company communicates its positioning. A conversation with the client will not give me that. They will always describe themselves through the lens of what they intend to say rather than what a prospective buyer actually receives.

I originally built the tool as a way to compare companies in the same industry from the perspective of a potential client with no prior knowledge of any of them. It naturally evolved into an audit tool that goes deep into a company's online presence. It looks at the most basic channels of communication, the website and the LinkedIn profile, then scores the content and forms a structured view of how clearly the company communicates who it is, who it serves, and why anyone should choose it.

Two use cases drove the design from the start. The first is the client audit: a diagnostic report produced at the beginning of an engagement, used to show the client where their messaging gaps actually are. The second is the sector report: a research piece produced by scoring a group of companies in the same industry. One tool, two outputs.

Seven Dimensions and Two Layers

The hardest design decision was the scoring framework. It needed to produce differentiated results across companies, apply across sectors, and reflect how positioning problems actually present in practice.

The framework is organised around a distinction that matters commercially: there is a difference between what a company is trying to say and how well it is actually saying it. Both are critical factors in a company's communication strategy, and they can fail independently or in tandem, compounding the marketing problem.

Layer One is called Strategic Signal. It covers the three things that must be true before anything else matters.

  • ICP Signal Clarity: Does the company know who it is talking to, and does that show in its communications?
  • Value Proposition: Is the value of what they offer stated clearly and specifically?
  • Differentiation: Can a prospective buyer articulate why they should choose this firm rather than a competitor?

Layer Two is called Communication Execution. It covers how well the message is actually delivered.

  • Messaging Consistency: Is the same story being told across all channels?
  • Tone of Voice: Is there a recognisable, distinctive voice in the content?
  • Proof and Credibility: Is there real evidence to support the claims being made?
  • Content Coherence: Does the body of content add up to a coherent picture of who this company is?

Each dimension has three sub-dimensions, giving 21 data points per audit. That granularity was deliberate. A single score for "value proposition" is not useful to a client. "Your clarity is strong but your benefit specificity is underdeveloped and your channel distribution is weak" gives them something to act on.
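The two-layer, seven-dimension structure can be sketched as a simple data structure. The sub-dimension names below are illustrative placeholders, not the tool's actual rubric labels (only "clarity", "benefit specificity" and "channel distribution" appear in this post):

```python
# Hypothetical sketch of the scoring framework's shape.
# Sub-dimension names are placeholders, not the real rubric.
FRAMEWORK = {
    "Strategic Signal": {
        "ICP Signal Clarity": ["audience_definition", "audience_visibility", "segment_focus"],
        "Value Proposition": ["clarity", "benefit_specificity", "channel_distribution"],
        "Differentiation": ["distinct_claim", "competitor_contrast", "reason_to_choose"],
    },
    "Communication Execution": {
        "Messaging Consistency": ["cross_channel_story", "terminology", "claims_alignment"],
        "Tone of Voice": ["recognisability", "distinctiveness", "consistency"],
        "Proof and Credibility": ["evidence", "specificity", "third_party_signals"],
        "Content Coherence": ["narrative_fit", "topic_focus", "overall_picture"],
    },
}

def count_data_points(framework):
    """Total sub-dimension scores produced per audit."""
    return sum(len(subs) for dims in framework.values() for subs in dims.values())
```

With seven dimensions of three sub-dimensions each, `count_data_points` returns the 21 data points mentioned above.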

The Scoring System

Scoring runs on a four-point scale: 0, 3, 7, and 10. Those are the only valid values.

A ten-point scale sounds more precise, but in real terms, what does a score of 7 tell you over a score of 6? We are not assessing precise facts or figures. The four points map to four named maturity levels: Absent, Partial, Present, and Strong. The gaps between values are deliberately unequal. Moving from Absent to Partial is a different category of improvement from moving from Present to Strong.

Two constraints are built into the scoring.

The "recency" cap ensures scores reflect what a prospect would encounter today. Content published more than twelve months ago is tagged Dated and capped at Partial (3). Content between three and twelve months old is tagged Recent and capped at Present (7). Only content from the last three months can score Strong (10). A well-written About page from three years ago that has not been touched does not serve a prospect forming an impression today.

The tier cap ensures scores reflect what a prospect would actually find. Tier 1 pages are those a prospect reaches in the first five minutes: the homepage, About, Services, Contact, and recent LinkedIn posts. These carry full scoring weight. Tier 2 pages, including blog posts and deep sub-pages, are capped at Partial regardless of quality. If your differentiation claim only lives in a 2022 blog post, it is not functioning as differentiation in your prospect's experience.
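The two caps compose as a simple minimum over the raw score. A minimal sketch of the rule as described above (the function name and the label for under-three-month content are my own, not the tool's):

```python
VALID_SCORES = (0, 3, 7, 10)

# Caps implied by the rules above; tag names are illustrative.
RECENCY_CAPS = {"Dated": 3, "Recent": 7, "Fresh": 10}  # >12mo, 3-12mo, <3mo
TIER_CAPS = {1: 10, 2: 3}  # Tier 2 content is capped at Partial

def capped_score(raw, recency_tag, tier):
    """Clamp a raw rubric score to the tightest applicable cap,
    then snap down to the nearest valid scale value."""
    cap = min(RECENCY_CAPS[recency_tag], TIER_CAPS[tier])
    allowed = [s for s in VALID_SCORES if s <= min(raw, cap)]
    return max(allowed)
```

So a Strong (10) piece of writing on a three-year-old page comes out as Partial (3), exactly as the About-page example describes.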

The Data Architecture

The tool draws from two sources: the company website and its LinkedIn presence. Both are collected using Apify, a commercial web scraping platform that handles JavaScript-rendered pages and works within LinkedIn's access restrictions.

For the website, Apify's content crawler visits up to 25 pages, renders them fully, extracts readable text, and returns metadata including publication dates. Each page is classified by tier and recency before scoring begins.
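The tier-and-recency classification of each crawled page can be sketched as follows. The URL heuristics and the label for under-three-month content are assumptions based on the rules described in this post, not the tool's actual code:

```python
from datetime import date, timedelta

# Tier 1: pages a prospect reaches in the first five minutes.
TIER1_PATHS = {"/", "/about", "/services", "/contact"}

def classify_page(path, published, today=None):
    """Return (tier, recency_tag) for a crawled page.
    Heuristics are illustrative; the real classifier may differ."""
    today = today or date.today()
    tier = 1 if path.rstrip("/").lower() in {p.rstrip("/") for p in TIER1_PATHS} else 2
    age = today - published
    if age > timedelta(days=365):
        tag = "Dated"
    elif age > timedelta(days=90):
        tag = "Recent"
    else:
        tag = "Fresh"
    return tier, tag
```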

For LinkedIn, a third-party actor retrieves the company's most recent posts. Getting this working took three separate debugging cycles. The actor's name changed partway through development. The input format differed from what the documentation described. The output structure was entirely different from what was assumed: instead of a company-level object with posts nested inside, each item in the dataset was a standalone post.
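The fix for the last of those surprises was to treat each dataset item as a standalone post rather than expecting a company object with posts nested inside. A sketch of the normalisation (the field names are assumptions about the actor's output, not documented fact):

```python
def normalise_linkedin_items(items):
    """Each Apify dataset item is a standalone post, not a
    company object with posts nested inside. Collect them into
    the shape the scorer expects. Field names are illustrative."""
    posts = []
    for item in items:
        posts.append({
            "text": item.get("text", ""),
            "posted_at": item.get("postedAt"),
        })
    return {"posts": posts}
```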

How the Scoring Works

Once data is gathered, the tool makes seven parallel AI calls, one for each dimension. Each call receives focused context: the website pages most relevant to that dimension, the LinkedIn posts, and the company's LinkedIn description.
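Dispatching one call per dimension concurrently can be done with `asyncio.gather`. A minimal sketch, with a stubbed scoring coroutine standing in for the real Anthropic API request:

```python
import asyncio

DIMENSIONS = [
    "ICP Signal Clarity", "Value Proposition", "Differentiation",
    "Messaging Consistency", "Tone of Voice",
    "Proof and Credibility", "Content Coherence",
]

async def score_dimension(dimension, context):
    # Stub: the real version sends the dimension's rubric and
    # focused context to the model and parses the JSON it returns.
    await asyncio.sleep(0)  # placeholder for network latency
    return {"dimension": dimension, "scores": {}}

async def score_all(context):
    """One call per dimension, dispatched concurrently."""
    tasks = [score_dimension(d, context) for d in DIMENSIONS]
    return await asyncio.gather(*tasks)
```

Because `gather` preserves task order, the results come back in the same dimension order they were dispatched.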

The prompt for each dimension includes the full rubric with detailed anchors for what Absent, Partial, Present, and Strong look like for that specific dimension. The model is instructed to return only valid JSON with scores of exactly 0, 3, 7, or 10. It also returns sub-dimension reasoning, evidence quotes pulled directly from the content, and a confidence rating per score.
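Because the model is instructed to return only values from the four-point scale, the parsed response can be validated before it is stored. A sketch, assuming a JSON shape of my own invention:

```python
import json

VALID_SCORES = {0, 3, 7, 10}

def parse_dimension_result(raw_json):
    """Parse a dimension response and reject any score outside
    the four-point scale. Field names are illustrative."""
    data = json.loads(raw_json)
    for sub in data["sub_dimensions"]:
        if sub["score"] not in VALID_SCORES:
            raise ValueError(f"invalid score {sub['score']!r}")
    return data
```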

Running seven calls in parallel rather than sequentially reduces total scoring time significantly. On the first full test run, all seven dimensions were completed in under a minute.

The Verification Layer

The tool does not publish scores directly. The model's output is a first draft, not a final report.

Before any report reaches a client, a consultant reviews each dimension, can adjust individual sub-dimension scores, and adds notes explaining the reasoning behind any change. This matters for two reasons. First, the model can misread context. A company in a specialised niche might score low on ICP clarity because the niche is implicit to everyone in the industry but invisible to an outside reader. A consultant who knows the sector can correct for that. Second, the consultant's judgement is part of what the client is paying for. The tool accelerates the analysis. A human remains responsible for the conclusions.

The verification interface is a browser-based application built with Streamlit. It lists completed audit runs, shows dimension scores with colour-coded maturity indicators, and allows score adjustments per sub-dimension. Any change recalculates the dimension and overall scores in real time. When the consultant is satisfied, they mark the audit as verified, which triggers final report generation.
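The real-time recalculation behind the interface is a straightforward aggregation: a dimension score is derived from its sub-dimension scores, and the overall score from the dimensions. A sketch assuming simple equal-weight averaging (the tool's actual weighting is not described in this post):

```python
def dimension_score(sub_scores):
    """Average of sub-dimension scores; assumes equal weighting."""
    return sum(sub_scores) / len(sub_scores)

def overall_score(dimensions):
    """dimensions: mapping of dimension name -> list of sub scores."""
    dims = [dimension_score(s) for s in dimensions.values()]
    return round(sum(dims) / len(dims), 1)
```

Whenever the consultant adjusts a sub-dimension value, re-running these two functions is all the "recalculates in real time" step requires.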

The Report

The output is a self-contained HTML file, readable in any browser and printable to PDF. The design requirement was that a client should be able to read it without any accompanying explanation.

It contains a header with the company name and overall score, a colour-coded bar chart showing all seven dimension scores, and the full dimension analysis grouped by layer. Each dimension card shows the consultant's reasoning, a sub-dimension breakdown table, and evidence quotes with recency labels. An expandable section explains the maturity levels, the tier system, and the recency caps, because a report is only useful if the reader understands what the scores mean.

Lessons Learnt

  1. It is easy to build a company comparison tool, and it will give you adequate results as long as all the information is available and easy to verify. However, and this is a big one: you will not derive clear insights from that first attempt, not the type you could justify in front of a well-seasoned marketer.
  2. To build a proper positioning tool, implicit knowledge of the practice is not enough. You need to pull out all the knowledge you carry inside you and put it into a framework that most of us in marketing assume is there but never actually see.
  3. If you want a professional-looking report, be ready to spend days and days talking to the LLM until you get the right margins, logos, font sizes and positioning on each page.
  4. There is a lot of potential to create a core product that works well across all industry sectors. Furthermore (I know, sounds like AI, right?), there is a myriad of add-ons that could be built to create a really robust tool that helps marketers, and companies in general, test their marketing effectiveness.

Frequently Asked Questions

How long does a full audit take to run? The automated scoring step completes in under a minute. Consultant verification and report review typically take one to two hours, depending on the volume of content available.

Can the tool be used for sectors other than construction? Yes. The seven dimensions apply across sectors. The construction sample was a test case, and the framework produced consistent, differentiated results across firms with very different specialisms. Sector-specific context is something the consultant adds during verification, not something baked into the scoring logic.

What AI model powers the scoring? The tool currently uses Claude via the Anthropic API. The prompts are structured to return only valid JSON and include strict scoring constraints to prevent the model from returning values outside the four-point scale.

About This Post

Octavio Hernández is a marketing consultant and coach working with SME founders and commercial leaders on positioning, B2B lead generation, and marketing strategy.

The positioning audit tool described in this post was built and tested between April and May 2026. The sector data referenced covers 16 Irish construction companies, assessed using publicly available website and LinkedIn content. Individual company results are anonymised.

Want to learn more, or talk about your positioning needs?

Book a call now.
