To pick the right first automation, score every candidate workflow on four factors: frequency, time per run, cost of errors, and integration feasibility. Rate each from 1 to 5, multiply the four numbers into one score, and rank the whole list from highest to lowest. The top score is the workflow to automate first. Multiply rather than add, so a workflow that fails badly on any single factor (especially feasibility) cannot win on raw volume. That one rule is the difference between a ranking that points at a workflow you can actually ship and a ranking that points at the one that will stall.

This guide gives you the rubric, a worked example, and a tie-break rule, so "pick your first workflow" becomes a number you can defend instead of a meeting you keep having. If you would rather we run this for you, see how we run AI strategy and an automation audit. Everything below is yours to use on your own.

Why does scoring beat a gut call?

Because adoption is everywhere and value is rare. McKinsey calls it the gen AI paradox: roughly 80% of companies have deployed generative AI, yet about the same share report no material impact on earnings. Around 90% of function-specific use cases never leave pilot, and fewer than 10% of organizations are scaling AI agents in any function. Choosing badly is the default outcome, not the exception.

That gap is a targeting problem, not a technology problem. The companies that capture value pick two or three high-value use cases and redesign the workflow around the agent instead of bolting it on. In McKinsey's data, workflow redesign is the single factor most correlated with bottom-line impact, and high performers are nearly 3x more likely to have fundamentally redesigned how the work runs. A score does not redesign anything, but it makes sure the workflow you redesign is the one worth the effort.

The upside when you choose well is large enough to take seriously. An IDC survey of more than 4,000 business leaders found an average return of $3.70 for every $1 invested in generative AI, rising to $10.30 per $1 for the top leaders. The distance between those two numbers is mostly about backing the right first workflow and running it properly, not about the model.

The standard automation guides all name the criteria for a good candidate (frequent, rule-based, manual, error-costly, contained), but they stop at criteria. None gives you a way to weigh several real candidates against each other and walk away with one answer. This rubric does.

What four factors should you score?

Four factors capture almost everything that decides whether a first workflow pays off. Rate each one from 1 (low) to 5 (high). The wording of each factor is set so that higher always means a better candidate.

| Factor | The question it answers | Score 5 when | Score 1 when |
| --- | --- | --- | --- |
| Frequency | How often does this run? | Many times a day | Once a month or less |
| Time per run | How long does one run take by hand? | A slow, multi-step grind | A few seconds |
| Error cost | What does a mistake cost in money, time, or trust? | Expensive and painful to fix | Nobody notices a slip |
| Feasibility | How reachable are the tools, data, and steps? | Clean data, normal apps, fixed steps | Messy data, no API, steps that change |

Each factor is doing distinct work, and together they cover the value side and the cost side of the decision:

  • Frequency and time per run together measure the prize. A workflow that runs fifty times a day and eats ten minutes each time is a different animal from one that runs weekly. This is why the playbooks all start with "audit the repetitive work": frequency times time is where the hours actually leak.
  • Error cost raises workflows where a mistake is expensive, not just slow. Asana names "the cost of errors or time is high enough to justify automating" as one of three traits of a good candidate.
  • Feasibility is the reality check the other three lack. A workflow can be frequent, slow, and error-costly and still be a terrible first project if its data is scattered across systems with no clean way in. Data quality is one of the most common reasons pilots die, so feasibility has to be in the score, not an afterthought.
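
The "prize" in the first bullet can be put into hours with simple arithmetic. The sketch below is only an illustration: the function name `weekly_hours` and the five-day work week are assumptions, not part of the rubric.

```python
def weekly_hours(runs_per_day: float, minutes_per_run: float, workdays_per_week: int = 5) -> float:
    """Hours per week a workflow consumes when done by hand.

    Assumes a five-day work week unless told otherwise.
    """
    return runs_per_day * minutes_per_run * workdays_per_week / 60


# The example from the first bullet: fifty runs a day at ten minutes each.
daily_bridge = weekly_hours(runs_per_day=50, minutes_per_run=10)

# A ten-minute task that runs once a week, expressed as 1/5 of a run per workday.
weekly_task = weekly_hours(runs_per_day=1 / 5, minutes_per_run=10)

print(round(daily_bridge, 1), round(weekly_task, 2))  # 41.7 0.17
```

Under those assumptions the daily workflow costs roughly 42 hours a week and the weekly one about ten minutes, which is the gap the frequency and time scores are meant to capture.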

How do you turn four factors into one ranking?

Multiply, do not add. For each candidate, compute:

Score = Frequency x Time x Error cost x Feasibility

The maximum is 5 x 5 x 5 x 5 = 625, the minimum is 1. Rank every candidate by this number, highest first. The top of the list is your first automation.
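
If you keep your candidates in a script rather than a spreadsheet, the formula can be sketched as below. The `Candidate` class and its field names are hypothetical, chosen only for this illustration; the range check simply enforces the 1 to 5 scale described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One workflow, rated 1 (low) to 5 (high) on each of the four factors."""
    name: str
    frequency: int
    time: int
    error_cost: int
    feasibility: int

    def __post_init__(self):
        # Reject ratings outside the rubric's 1-5 scale.
        for factor in ("frequency", "time", "error_cost", "feasibility"):
            value = getattr(self, factor)
            if not 1 <= value <= 5:
                raise ValueError(f"{factor} must be between 1 and 5, got {value}")

    @property
    def score(self) -> int:
        # Score = Frequency x Time x Error cost x Feasibility
        return self.frequency * self.time * self.error_cost * self.feasibility


best = Candidate("all fives", 5, 5, 5, 5)
worst = Candidate("all ones", 1, 1, 1, 1)
print(best.score, worst.score)  # 625 1
```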

Multiplication matters because of how the factors interact. Adding lets a high frequency paper over a fatal weakness: a workflow scoring 5+5+5+1 adds to 16, which looks respectable, even though that 1 on feasibility means it will never ship. Multiplying, the same workflow scores 5 x 5 x 5 x 1 = 125, while a slightly less frequent but fully feasible workflow at 4 x 4 x 4 x 5 = 320 clears it easily. The product punishes any single weak factor, which is exactly the behavior you want, because in practice one weak factor (usually feasibility) is what kills a pilot.
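
The comparison in the paragraph above can be checked in a few lines. This is a minimal sketch; the variable names are illustrative, and the ratings are listed in the order frequency, time, error cost, feasibility.

```python
from math import prod

# Ratings in the order: frequency, time, error cost, feasibility.
exciting_but_blocked = (5, 5, 5, 1)
boring_but_shippable = (4, 4, 4, 5)

for label, ratings in [("blocked", exciting_but_blocked),
                       ("shippable", boring_but_shippable)]:
    print(label, "sum =", sum(ratings), "product =", prod(ratings))
# blocked sum = 16 product = 125
# shippable sum = 17 product = 320
```

Added up, the two workflows look nearly tied (16 against 17). Multiplied, the feasible one scores more than twice as high.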

This also surfaces the trap most teams fall into. The most exciting workflow is often the one with the lowest feasibility, while a boring, high-frequency, app-to-app bridge quietly posts the highest product. Trust the number over the excitement.

What does the ranking look like worked through?

Here is the rubric applied to five candidates a typical small business might list after auditing a week. Each factor is scored 1 to 5, and the final column is the product.

| Candidate workflow | Frequency | Time | Error cost | Feasibility | Score |
| --- | --- | --- | --- | --- | --- |
| Copy new leads from inbox into the CRM and assign an owner | 5 | 3 | 4 | 5 | 300 |
| Re-key invoice details from email into accounting | 4 | 4 | 5 | 4 | 320 |
| Draft answers to the same five support questions | 5 | 3 | 3 | 4 | 180 |
| Rebuild the weekly management report | 1 | 5 | 4 | 3 | 60 |
| Price a custom enterprise deal | 2 | 5 | 5 | 1 | 50 |
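
The same table can be scored and ranked in a short script. This is a sketch rather than a prescribed tool: the dictionary layout is an assumption, and the ratings are copied from the table in the order frequency, time, error cost, feasibility.

```python
from math import prod

# Ratings per candidate: (frequency, time, error cost, feasibility).
candidates = {
    "Copy new leads from inbox into the CRM and assign an owner": (5, 3, 4, 5),
    "Re-key invoice details from email into accounting": (4, 4, 5, 4),
    "Draft answers to the same five support questions": (5, 3, 3, 4),
    "Rebuild the weekly management report": (1, 5, 4, 3),
    "Price a custom enterprise deal": (2, 5, 5, 1),
}

# Rank by the product of the four ratings, highest first.
ranked = sorted(candidates.items(), key=lambda item: prod(item[1]), reverse=True)

for name, ratings in ranked:
    print(f"{prod(ratings):>3}  {name}")
```

Run as written, it lists the candidates in the order 320, 300, 180, 60, 50.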

Read the ranking from the top. Invoice re-keying (320) and lead capture (300) are the two clear winners, and both are textbook app-to-app bridges, the spot where a person reads data out of one app and types it into another. The Zapier playbook calls that bridge the single best heuristic for a first automation, and the scores agree: both land high on frequency and feasibility at once. McKinsey Global Institute estimates 69% of data processing and 64% of data collection work is technically automatable, and these bridges are exactly that work.

Now read the bottom. The weekly report (60) is slow and error-costly but rare, so its low frequency caps it. The custom-deal pricing (50) scores 1 on feasibility because it is judgment work with steps that change every time, and the multiplication is merciless about that. Both are real work. Neither is a good first project. The rubric tells you that without an argument.

Notice what the math did: support-question drafting (180) feels like an obvious AI use case, yet it ranks third behind two duller bridges because its time per run and error cost are both a middling 3. That is the rubric earning its keep, demoting the exciting candidate and promoting the boring one that will actually ship.

How do you break a tie at the top?

Invoice re-keying at 320 and lead capture at 300 are close enough that, with slightly different judgment on a single factor, they could swap. When two candidates sit close enough that a one-point change on a single factor would reorder them, break the tie with this order:

  1. Feasibility first. The more reachable workflow ships faster and proves ROI sooner. A pilot that goes live in two weeks beats a marginally higher score that takes two months to wire up.
  2. Containment second. Favor the workflow whose output you can review in isolation, without touching the rest of the business. HubSpot's advice is to automate one workflow that is high-impact but contained, precisely so you can pilot it safely.
  3. Measurability third. Favor the win that is easiest to put a number on (hours saved or money recovered), because a clear before-and-after is what funds the next workflow.

Speed to a proven result beats a marginally higher score. The goal of the whole exercise is one decided, shippable workflow, not the theoretically perfect one.
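
The tie-break order can be sketched as a sort key. Everything beyond the rubric's own numbers is an assumption here: the `Finalist` class, the 10% `tie_margin` used to decide what counts as close, and the containment and measurability ratings, which are placeholder values for illustration only. The scores and feasibility ratings come from the worked example.

```python
from dataclasses import dataclass


@dataclass
class Finalist:
    name: str
    score: int
    feasibility: int    # 1-5, from the rubric
    containment: int    # 1-5, hypothetical rating: can the output be reviewed in isolation?
    measurability: int  # 1-5, hypothetical rating: how easy is a before-and-after number?


def pick_first(finalists, tie_margin=0.10):
    """Pick the top scorer, breaking near-ties on feasibility, containment, measurability."""
    top = max(f.score for f in finalists)
    # Treat anything within tie_margin of the top score as tied (assumed threshold).
    near_ties = [f for f in finalists if f.score >= top * (1 - tie_margin)]
    return max(near_ties, key=lambda f: (f.feasibility, f.containment, f.measurability))


finalists = [
    Finalist("Invoice re-keying", 320, feasibility=4, containment=4, measurability=4),
    Finalist("Lead capture", 300, feasibility=5, containment=4, measurability=4),
    Finalist("Support drafts", 180, feasibility=4, containment=3, measurability=3),
]
print(pick_first(finalists).name)  # Lead capture
```

With these placeholder ratings, lead capture wins the tie on feasibility alone (5 against 4), even though invoice re-keying has the higher raw score.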

A few guardrails keep the scoring honest:

  • Score the work as it runs today, not as you wish it ran. If the data is messy now, feasibility is low now. Scoring the fantasy version is how teams talk themselves into a stalled pilot.
  • Let a 1 on feasibility veto. If a candidate scores 1 on feasibility, it is not a first project no matter how it looks elsewhere. Fix the data or the access first, then re-score.
  • Aim for the sweet spot, not the hardest problem. The best first workflow removes the grind without removing the judgment that still needs a person. A low rule-based score (which shows up as low feasibility here) is a signal to keep the human and automate around them, not to force it.
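
The feasibility veto from the second guardrail can be applied before ranking. The sketch below assumes ratings are stored as tuples in the order frequency, time, error cost, feasibility. The function name `rank_with_veto` and the third candidate are hypothetical, added to show a high product being overruled.

```python
from math import prod

FEASIBILITY = 3  # position of feasibility in (frequency, time, error cost, feasibility)


def rank_with_veto(candidates):
    """Return (ranked shortlist, vetoed list); a 1 on feasibility is a veto."""
    vetoed = [name for name, r in candidates.items() if r[FEASIBILITY] == 1]
    eligible = [name for name in candidates if name not in vetoed]
    ranked = sorted(eligible, key=lambda name: prod(candidates[name]), reverse=True)
    return ranked, vetoed


candidates = {
    "Rebuild the weekly management report": (1, 5, 4, 3),         # product 60
    "Price a custom enterprise deal": (2, 5, 5, 1),                # product 50
    "Hypothetical: high-volume task with no API": (5, 5, 5, 1),    # product 125
}
ranked, vetoed = rank_with_veto(candidates)
print("shortlist:", ranked)
print("fix access, then re-score:", vetoed)
```

The hypothetical candidate would outrank the weekly report on product alone (125 against 60), but the veto keeps it off the shortlist until its data or access is fixed.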

How do you turn the winning score into a working automation?

The score tells you what to automate first. It does not build it. Once you have your top scorer, the path is short and the same every time:

  1. Map the current workflow. Write down who touches it, what triggers it, what information they read, what decision they make, and what they produce. If you cannot describe it clearly enough for a new hire to follow, an agent cannot follow it either. This map defines the trigger, conditions, and actions, and it is your proof later of what "working" means.
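  2. Decide where a human checkpoint belongs. Let the factor scores point to it. If error cost scored high, put the checkpoint at the step where a mistake is expensive, so the agent escalates instead of guessing. If feasibility was the lowest score, put it where the data or the steps are least predictable.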
  3. Run one pilot and measure it. Run the new workflow alongside the old one for a short period and compare against the map. Pick the metric up front: hours saved, records processed without errors, faster follow-up. One concrete payoff to watch for is response speed, since instant automated responses make a follow-up about 42% more likely to land.
  4. Prove the return, then reuse the rubric. Once the pilot clears the bar, you have a template. Re-score your remaining candidates with the same four factors, funded by the savings from the first, and back the next winner.
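
Step 3 asks you to pick the metric and the bar before the pilot starts. One way to keep that honest is to write both down as code. This is a sketch under stated assumptions: the function names, the thresholds, and every number in the example are hypothetical placeholders, not measured results.

```python
def pilot_summary(runs, manual_minutes, automated_minutes, manual_errors, automated_errors):
    """Compare the old and new workflow over the same number of runs."""
    return {
        "hours_saved": runs * (manual_minutes - automated_minutes) / 60,
        "manual_error_rate": manual_errors / runs,
        "automated_error_rate": automated_errors / runs,
    }


def clears_bar(summary, min_hours_saved, max_error_rate):
    """True when the pilot meets the bar that was agreed before it started."""
    return (summary["hours_saved"] >= min_hours_saved
            and summary["automated_error_rate"] <= max_error_rate)


# Hypothetical pilot: 200 runs, 6 minutes by hand, 1 minute of human review when automated.
summary = pilot_summary(runs=200, manual_minutes=6, automated_minutes=1,
                        manual_errors=8, automated_errors=2)
print(round(summary["hours_saved"], 1))  # 16.7
print(clears_bar(summary, min_hours_saved=10, max_error_rate=0.02))  # True
```

Setting `min_hours_saved` and `max_error_rate` up front is the point: the pilot either clears a bar you committed to, or it does not.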

This last part is where most do-it-yourself pilots stall, and it is exactly why around 90% of function-specific use cases never leave pilot. The scoring is the easy half. The build, the integration, the monitoring, and the iteration are the half that keeps a workflow out of the pilot graveyard, and they are the half McKinsey's data points to when it says workflow redesign, not a number on a spreadsheet, is what moves EBIT.

How to get started

You do not need a transformation program to begin. This week, audit your time, list every repetitive manual task, score each one on frequency, time, error cost, and feasibility, multiply, and rank. Back the top scorer, break any tie on feasibility, map it, pilot it end to end, and measure it. Then reuse the rubric on the next candidate. That is the whole method, and it is the difference between the companies that get $3.70 back per dollar and the leaders that get $10.30.

If you want the fastest path, you can skip the trial and error. We do the audit, score your candidate workflows with you, rank them down to the single highest-ROI first automation, then plan, build, integrate, and run the agent, including the monitoring and iteration that keep it out of the pilot graveyard. Book a free consultation below and we will rank your candidates and decide your first automation together.