Getting Started · June 8, 2026 · 9 min read

How to Score and Rank Your AI Automation Candidates (So You Pick the Right One)

Score each AI automation candidate on frequency, time, error cost, and feasibility, multiply into one number, rank the list, and back the winner.

Key Facts

Score each candidate workflow on four factors (frequency, time per run, cost of errors, and integration feasibility), multiply them into a single number, and rank the list from highest to lowest. The top score is your first automation. Use multiplication, not addition, so a workflow that fails badly on any one factor (especially feasibility) cannot win on volume alone. This matters because of the gen AI paradox: about 80% of companies use gen AI, yet roughly the same share report no bottom-line impact. The companies that do see value back a single, decided workflow instead of spreading effort thin.

Mahmoud Zalt

Founder & AI Strategist · Sistava

To pick the right first automation, score every candidate workflow on four factors: frequency, time per run, cost of errors, and integration feasibility. Rate each from 1 to 5, multiply the four numbers into one score, and rank the whole list from highest to lowest. The top score is the workflow to automate first. Multiply rather than add, so a workflow that fails badly on any single factor (especially feasibility) cannot win on raw volume. That one rule is the difference between a ranking that points at a workflow you can actually ship and a ranking that points at the one that will stall.

This guide gives you the rubric, a worked example, and a tie-break rule, so "pick your first workflow" becomes a number you can defend instead of a meeting you keep having. If you would rather we run this for you, see how we run AI strategy and an automation audit. Everything below is yours to use on your own.

Why does scoring beat a gut call?

Because adoption is everywhere and value is rare. McKinsey calls it the gen AI paradox: roughly 80% of companies have deployed generative AI, yet about the same share report no material impact on earnings. Around 90% of function-specific use cases never leave pilot, and fewer than 10% of organizations are scaling AI agents in any function. Choosing badly is the default outcome, not the exception.

That gap is a targeting problem, not a technology problem. The companies that capture value pick two or three high-value use cases and redesign the workflow around the agent instead of bolting it on. In McKinsey's data, workflow redesign is the single factor most correlated with bottom-line impact, and high performers are nearly 3x more likely to have fundamentally redesigned how the work runs. A score does not redesign anything, but it makes sure the workflow you redesign is the one worth the effort.

The upside when you choose well is large enough to take seriously. An IDC survey of more than 4,000 business leaders found an average return of $3.70 for every $1 invested in generative AI, rising to $10.30 per $1 for the top leaders. The distance between those two numbers is mostly about backing the right first workflow and running it properly, not about the model.

The standard automation guides all name the criteria for a good candidate (frequent, rule-based, manual, error-costly, contained), but they stop at criteria. None gives you a way to weigh several real candidates against each other and walk away with one answer. This guide does.

What four factors should you score?

Four factors capture almost everything that decides whether a first workflow pays off. Rate each one from 1 (low) to 5 (high). The wording of each factor is set so that higher always means a better candidate.

Factor	The question it answers	Score 5 when	Score 1 when
Frequency	How often does this run?	Many times a day	Once a month or less
Time per run	How long does one run take by hand?	A slow, multi-step grind	A few seconds
Error cost	What does a mistake cost in money, time, or trust?	Expensive and painful to fix	Nobody notices a slip
Feasibility	How reachable are the tools, data, and steps?	Clean data, normal apps, fixed steps	Messy data, no API, steps that change

Each factor is doing distinct work, and together they cover the value side and the cost side of the decision:

Frequency and time per run together measure the prize. A workflow that runs fifty times a day and eats ten minutes each time is a different animal from one that runs weekly. This is why the playbooks all start with "audit the repetitive work": frequency times time is where the hours actually leak.
Error cost raises workflows where a mistake is expensive, not just slow. Asana names "the cost of errors or time is high enough to justify automating" as one of three traits of a good candidate.
Feasibility is the reality check the other three lack. A workflow can be frequent, slow, and error-costly and still be a terrible first project if its data is scattered across systems with no clean way in. Data quality is one of the most common reasons pilots die, so feasibility has to be in the score, not an afterthought.
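To make "frequency times time" concrete, here is a minimal Python sketch of the hours a manual workflow consumes each week. The function name, the five-day working week, and the weekly comparison task are assumptions for illustration. Only the fifty-runs, ten-minutes figures come from the example above.

```python
def hours_per_week(runs_per_day: float, minutes_per_run: float, working_days: int = 5) -> float:
    """Hours a manual workflow consumes in one working week (assumed 5 days)."""
    return runs_per_day * minutes_per_run * working_days / 60


# The example from the text: fifty runs a day at ten minutes each.
daily_grind = hours_per_week(runs_per_day=50, minutes_per_run=10)

# A hypothetical weekly task: one run per week, i.e. 0.2 runs per working day.
weekly_task = hours_per_week(runs_per_day=0.2, minutes_per_run=10)

print(f"Daily workflow:  {daily_grind:.1f} hours/week")
print(f"Weekly workflow: {weekly_task:.1f} hours/week")
```

Under these assumptions the daily workflow eats roughly a full working week of someone's time, while the weekly one costs minutes. That gap is what the frequency and time ratings are meant to capture.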

How do you turn four factors into one ranking?

Multiply, do not add. For each candidate, compute:

Score = Frequency x Time x Error cost x Feasibility

The maximum is 5 x 5 x 5 x 5 = 625, the minimum is 1. Rank every candidate by this number, highest first. The top of the list is your first automation.
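As a minimal sketch, the formula can be written as a small Python function. The function name and the choice to reject ratings outside 1 to 5 are assumptions for illustration, not part of the rubric itself.

```python
def automation_score(frequency: int, time: int, error_cost: int, feasibility: int) -> int:
    """Multiply four 1-to-5 ratings into a single score between 1 and 625."""
    ratings = {
        "frequency": frequency,
        "time": time,
        "error_cost": error_cost,
        "feasibility": feasibility,
    }
    for name, value in ratings.items():
        # Reject anything outside the rubric's 1-to-5 scale.
        if not isinstance(value, int) or not 1 <= value <= 5:
            raise ValueError(f"{name} must be an integer from 1 to 5, got {value!r}")
    return frequency * time * error_cost * feasibility


print(automation_score(5, 5, 5, 5))  # 625, the maximum
print(automation_score(1, 1, 1, 1))  # 1, the minimum
```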

Multiplication matters because of how the factors interact. Adding lets a high frequency paper over a fatal weakness: a workflow scoring 5+5+5+1 adds to 16, which looks respectable, even though that 1 on feasibility means it will never ship. Multiplying, the same workflow scores 5 x 5 x 5 x 1 = 125, while a slightly less frequent but fully feasible workflow at 4 x 4 x 4 x 5 = 320 clears it easily. The product punishes any single weak factor, which is exactly the behavior you want, because in practice one weak factor (usually feasibility) is what kills a pilot.
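The comparison is easy to check for yourself. This short Python sketch scores the two workflows from the paragraph above both ways; the variable names are illustrative.

```python
from math import prod

# Ratings are (frequency, time, error cost, feasibility).
volume_heavy = (5, 5, 5, 1)    # frequent and painful, but it will never ship
fully_feasible = (4, 4, 4, 5)  # slightly smaller prize, easy to reach

for label, ratings in [("volume_heavy", volume_heavy), ("fully_feasible", fully_feasible)]:
    print(f"{label:15} sum={sum(ratings):2}  product={prod(ratings)}")
```

Added, the two are nearly tied (16 against 17). Multiplied, the feasible workflow scores more than double (320 against 125).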

This also surfaces the trap most teams fall into. The most exciting workflow is often the one with the lowest feasibility, while a boring, high-frequency, app-to-app bridge quietly posts the highest product. Trust the number over the excitement.

What does the ranking look like worked through?

Here is the rubric applied to five candidates a typical small business might list after auditing a week. Each factor is scored 1 to 5, and the final column is the product.

Candidate workflow	Frequency	Time	Error cost	Feasibility	Score
Copy new leads from inbox into the CRM and assign an owner	5	3	4	5	300
Re-key invoice details from email into accounting	4	4	5	4	320
Draft answers to the same five support questions	5	3	3	4	180
Rebuild the weekly management report	1	5	4	3	60
Price a custom enterprise deal	2	5	5	1	50

Read the ranking from the top. Invoice re-keying (320) and lead capture (300) are the two clear winners, and both are textbook app-to-app bridges, the spot where a person reads data out of one app and types it into another. The Zapier playbook calls that bridge the single best heuristic for a first automation, and the scores agree: both land high on frequency and feasibility at once. McKinsey Global Institute estimates 69% of data processing and 64% of data collection work is technically automatable, and these bridges are exactly that work.

Now read the bottom. The weekly report (60) is slow and error-costly but rare, so its low frequency caps it. The custom-deal pricing (50) scores 1 on feasibility because it is judgment work with steps that change every time, and the multiplication is merciless about that. Both are real work. Neither is a good first project. The rubric tells you that without an argument.

Notice what the math did: support-question drafting (180) feels like an obvious AI use case, yet it ranks third behind two duller bridges because its time per run and error cost are only middling (3 each). That is the rubric earning its keep, demoting the exciting candidate and promoting the boring one that will actually ship.
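If your candidates live in a spreadsheet or a script, the ranking takes a few lines. The sketch below reproduces the worked table in Python; the ratings are copied from the table, the workflow names are shortened, and the data layout is an assumption for illustration.

```python
from math import prod

# Ratings copied from the worked table: (frequency, time, error cost, feasibility).
candidates = {
    "Copy new leads from inbox into the CRM": (5, 3, 4, 5),
    "Re-key invoice details into accounting": (4, 4, 5, 4),
    "Draft answers to recurring support questions": (5, 3, 3, 4),
    "Rebuild the weekly management report": (1, 5, 4, 3),
    "Price a custom enterprise deal": (2, 5, 5, 1),
}

# Sort by product, highest first.
ranking = sorted(
    ((prod(ratings), name) for name, ratings in candidates.items()),
    reverse=True,
)

for rank, (score, name) in enumerate(ranking, start=1):
    print(f"{rank}. {score:>3}  {name}")
```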

How do you break a tie at the top?

Invoice re-keying at 320 and lead capture at 300 are close enough that a one-point change in judgment on a single factor could swap them. When two candidates land that close together, break the tie in this order:

Feasibility first. The more reachable workflow ships faster and proves ROI sooner. A pilot that goes live in two weeks beats a marginally higher score that takes two months to wire up.
Containment second. Favor the workflow whose output you can review in isolation, without touching the rest of the business. HubSpot's advice is to automate one workflow that is high-impact but contained, precisely so you can pilot it safely.
Measurability third. Favor the win that is easiest to put a number on, hours saved or money recovered, because a clear before-and-after is what funds the next workflow.

Speed to a proven result beats a marginally higher score. The goal of the whole exercise is one decided, shippable workflow, not the theoretically perfect one.
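The tie-break order can be sketched as a sort key. In this Python example the scores and feasibility ratings come from the worked table, but the containment and measurability ratings, the `tie_margin` of 25 points, and all names are assumptions made up for illustration. The rubric itself does not score containment or measurability, so treat those as your own judgment calls.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    score: int          # product of the four factors
    feasibility: int    # 1 to 5, from the rubric
    containment: int    # 1 to 5, hypothetical: can the output be reviewed in isolation?
    measurability: int  # 1 to 5, hypothetical: how easy is the win to quantify?


def pick_first(candidates: list[Candidate], tie_margin: int = 25) -> Candidate:
    """Return the top scorer, applying the tie-break order among near-ties."""
    best = max(c.score for c in candidates)
    # Anything within tie_margin points of the best score counts as tied.
    near_ties = [c for c in candidates if best - c.score <= tie_margin]
    # Feasibility first, containment second, measurability third.
    return max(near_ties, key=lambda c: (c.feasibility, c.containment, c.measurability))


invoices = Candidate("Invoice re-keying", 320, feasibility=4, containment=4, measurability=5)
leads = Candidate("Lead capture", 300, feasibility=5, containment=4, measurability=4)

print(pick_first([invoices, leads]).name)  # Lead capture
```

With these illustrative ratings, lead capture wins the tie on feasibility even though it scores 20 points lower. A different margin or different ratings would change the answer, which is why the order of the tie-break matters more than the exact numbers.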

A few guardrails keep the scoring honest:

Score the work as it runs today, not as you wish it ran. If the data is messy now, feasibility is low now. Scoring the fantasy version is how teams talk themselves into a stalled pilot.
Let a 1 on feasibility veto. If a candidate scores 1 on feasibility, it is not a first project no matter how it looks elsewhere. Fix the data or the access first, then re-score.
Aim for the sweet spot, not the hardest problem. The best first workflow removes the grind without removing the judgment that still needs a person. Work that is not rule-based (which shows up as low feasibility here) is a signal to keep the human and automate around them, not to force full automation.
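The feasibility veto is simple to enforce in code. Here is a minimal Python sketch, assuming ratings are stored as (frequency, time, error cost, feasibility) tuples; the function name is hypothetical.

```python
from math import prod


def eligible_first_projects(
    candidates: dict[str, tuple[int, int, int, int]],
) -> list[tuple[int, str]]:
    """Rank candidates by product, skipping any whose feasibility (last rating) is 1."""
    ranked = [
        (prod(ratings), name)
        for name, ratings in candidates.items()
        if ratings[3] > 1  # a 1 on feasibility vetoes the candidate outright
    ]
    return sorted(ranked, reverse=True)


candidates = {
    "Re-key invoice details": (4, 4, 5, 4),
    "Price a custom enterprise deal": (2, 5, 5, 1),
}

print(eligible_first_projects(candidates))  # [(320, 'Re-key invoice details')]
```

The vetoed candidate is not discarded for good. Once the data or access problem is fixed, re-score it and it re-enters the list.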

How do you turn the winning score into a working automation?

The score tells you what to automate first. It does not build it. Once you have your top scorer, the path is short and the same every time:

Map the current workflow. Write down who touches it, what triggers it, what information they read, what decision they make, and what they produce. If you cannot describe it clearly enough for a new hire to follow, an agent cannot follow it either. This map defines the trigger, conditions, and actions, and it is your proof later of what "working" means.
Decide where a human checkpoint belongs. Let the scores guide you: start with the factor the workflow scored weakest on, and wherever a mistake is expensive (a high error-cost rating), have the agent escalate instead of guessing.
Run one pilot and measure it. Run the new workflow alongside the old one for a short period and compare against the map. Pick the metric up front: hours saved, records processed without errors, faster follow-up. One concrete payoff to watch for is response speed, since instant automated responses make a follow-up about 42% more likely to land.
Prove the return, then reuse the rubric. Once the pilot clears the bar, you have a template. Re-score your remaining candidates with the same four factors, funded by the savings from the first, and back the next winner.
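The map from the first step can be kept as a small structured record, so gaps are obvious before you build anything. This is a minimal Python sketch: the class, its field names, and the lead-capture values are all hypothetical, chosen to mirror the five things the map is meant to capture.

```python
from dataclasses import dataclass, fields


@dataclass
class WorkflowMap:
    who_touches_it: str = ""
    trigger: str = ""
    information_read: str = ""
    decision_made: str = ""
    output_produced: str = ""

    def missing(self) -> list[str]:
        """Names of the parts of the map that are still blank."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


# Illustrative values only; the decision step has been left out on purpose.
lead_capture = WorkflowMap(
    who_touches_it="Sales coordinator",
    trigger="New enquiry arrives in the shared inbox",
    information_read="Name, company, email, request",
    output_produced="CRM record with an assigned owner",
)

print(lead_capture.missing())  # ['decision_made']
```

A map with anything still missing is not ready to hand to a new hire, and so not ready to hand to an agent either.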

The build is where most do-it-yourself pilots stall, and it goes a long way toward explaining why around 90% of function-specific use cases never leave pilot. The scoring is the easy half. The build, the integration, the monitoring, and the iteration are the half that keeps a workflow out of the pilot graveyard, and they are the half McKinsey's data points to when it says workflow redesign, not a number on a spreadsheet, is what moves EBIT.

How to get started

You do not need a transformation program to begin. This week, audit your time, list every repetitive manual task, score each one on frequency, time, error cost, and feasibility, multiply, and rank. Back the top scorer, break any tie on feasibility, map it, pilot it end to end, and measure it. Then reuse the rubric on the next candidate. That is the whole method, and it is the difference between the companies that get $3.70 back per dollar and the leaders that get $10.30.

If you want the fastest path, you can skip the trial and error. We do the audit, score your candidate workflows with you, rank them down to the single highest-ROI first automation, then plan, build, integrate, and run the agent, including the monitoring and iteration that keep it out of the pilot graveyard. Book a free consultation below and we will rank your candidates and decide your first automation together.

Want us to score your candidates and build the winner?

We plan, build, and run the AI agents inside your business, starting with the one workflow that scores highest. Book a free consultation.

Book your free consultation

Frequently Asked Questions

1. How do you score AI automation candidates?

Rate each candidate workflow from 1 to 5 on four factors: how often it runs (frequency), how long each run takes by hand (time), how costly its errors are (error cost), and how reachable the tools and data are (feasibility). Multiply the four numbers into one score, then rank every candidate by that score. The highest score is the workflow to automate first.

2. Why multiply the scores instead of adding them?

Multiplication makes any single weak factor drag the whole score down, which is what you want. A workflow that runs constantly but sits on messy, unreachable data scores a 1 on feasibility, so its product collapses and it cannot win on volume alone. Adding would let a high frequency hide a fatal feasibility problem, which is exactly how pilots stall.

3. What is the best AI automation candidate to pick first?

The one with the highest score: frequency times time times error cost times feasibility. In practice that is almost always a frequent, rule-based, manual workflow where a person bridges two apps by hand, because it scores high on frequency and feasibility at once. Pick the single highest scorer, not the most exciting idea.

4. How do I break a tie between two high-scoring workflows?

Break ties on feasibility first, since the more reachable workflow ships faster and proves ROI sooner. If feasibility is equal, favor the one with the cleaner contained output you can review in isolation, then the one whose win is easiest to measure in hours or money. Speed to a proven result beats a marginally higher score.

5. Does a high score guarantee the automation will succeed?

No. The score tells you which workflow is worth backing first, but McKinsey's data shows workflow redesign, not a number on a spreadsheet, is the factor most correlated with bottom-line impact. The build, integration, monitoring, and iteration are where most do-it-yourself pilots stall. Score to choose, then redesign and run to win.
