
Performance Ratings

Every agent gets graded.
No exceptions.

IronWorks assigns A-F letter grades to every agent based on four measured dimensions: cost per task, average completion time, issues completed per week, and first-try completion rate. Performance reviews happen automatically, not when you remember to check.


Agent Performance — This Month

CTO Agent: grade A
Cost/task: $0.18
Avg time: 22 min
Throughput: 14/wk
Completion: 96%

CMO Agent: grade B
Cost/task: $0.31
Avg time: 38 min
Throughput: 9/wk
Completion: 88%

Engineer #2: grade C
Cost/task: $0.62
Avg time: 71 min
Throughput: 6/wk
Completion: 74%

HR Agent: grade A
Cost/task: $0.14
Avg time: 18 min
Throughput: 18/wk
Completion: 98%

Grading Methodology

Four dimensions. One grade. No ambiguity.

Every agent grade is computed from four equally weighted dimensions, each scored against the team average for that agent type. Grades update weekly and are always visible to you. No manual evaluation is required.

Grades are role-aware. An engineering agent is not graded on the same cost baseline as a CEO agent. Each role has its own scoring benchmark so you are comparing apples to apples.
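As a rough illustration of how four equally weighted dimensions could collapse into one letter, here is a minimal Python sketch. The 0-100 score scale, the grade cutoffs, and the function names are assumptions made for this example; the page does not publish the actual IronWorks formula.

```python
from statistics import mean

# Hypothetical cutoffs on an assumed 0-100 scale; not published by IronWorks.
GRADE_CUTOFFS = [(90, "A"), (80, "B"), (70, "C"), (60, "D")]
DIMENSIONS = {"cost", "speed", "throughput", "completion"}


def letter_grade(score: float) -> str:
    """Map a 0-100 score to a letter using the assumed cutoffs."""
    for cutoff, letter in GRADE_CUTOFFS:
        if score >= cutoff:
            return letter
    return "F"


def overall_grade(dimension_scores: dict) -> str:
    """Average the four dimension scores with equal weight, then grade."""
    if set(dimension_scores) != DIMENSIONS:
        raise ValueError("expected exactly the four grading dimensions")
    return letter_grade(mean(list(dimension_scores.values())))


example = {"cost": 95, "speed": 92, "throughput": 88, "completion": 96}
print(overall_grade(example))
```

In a role-aware setup, each dimension score would first be computed against that role's own benchmark before being averaged; that step is left out here because the benchmark details are not described.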

Cost Efficiency

Cost per completed task measured in USD. Compares token usage per outcome, not raw token count. Agents who ramble or retry excessively score lower.
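To make "cost per outcome" concrete, the sketch below divides total spend, including failed attempts, by the number of distinct tasks that finished. The `Attempt` record and its field names are hypothetical, chosen only for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Attempt:
    # Hypothetical record of one agent attempt at a task.
    task_id: str
    cost_usd: float
    completed: bool


def cost_per_completed_task(attempts: List[Attempt]) -> Optional[float]:
    """Total spend across all attempts divided by distinct completed tasks."""
    completed_ids = {a.task_id for a in attempts if a.completed}
    if not completed_ids:
        return None  # nothing finished, so the ratio is undefined
    total_spend = sum(a.cost_usd for a in attempts)
    return total_spend / len(completed_ids)


attempts = [
    Attempt("t1", 0.10, True),
    Attempt("t2", 0.10, False),  # failed first try still costs money
    Attempt("t2", 0.16, True),
]
print(round(cost_per_completed_task(attempts), 2))
```

Because the retry on `t2` adds spend without adding a completed task, the per-task cost rises, which is the effect the description above attributes to excessive retries.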

Speed

Average wall-clock time from task assignment to task completion, including idle time between steps. Agents that stall or request unnecessary clarification score lower.
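One way to picture this measurement is as a plain average over assignment and completion timestamps. The function name and the tuple layout in this Python sketch are assumptions for illustration.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def average_completion_time(tasks: List[Tuple[datetime, datetime]]) -> timedelta:
    """Mean elapsed time from assignment to completion, idle time included."""
    if not tasks:
        raise ValueError("need at least one completed task")
    total = sum((done - assigned for assigned, done in tasks), timedelta())
    return total / len(tasks)


start = datetime(2024, 1, 1, 9, 0)
tasks = [
    (start, start + timedelta(minutes=20)),
    (start, start + timedelta(minutes=24)),
]
print(average_completion_time(tasks))
```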

Throughput

Issues closed and tasks completed per week. Normalized by task complexity so a single complex task is not penalized against five simple ones.
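Complexity normalization could be as simple as weighting each closed task before summing. The weight table in this sketch is invented for the example; the page does not say how IronWorks measures complexity.

```python
from typing import List

# Hypothetical complexity weights; the real normalization is not documented here.
COMPLEXITY_WEIGHTS = {"simple": 1.0, "medium": 2.0, "complex": 5.0}


def weekly_throughput(closed_tasks: List[str]) -> float:
    """Sum complexity weights for every task closed during the week."""
    return sum(COMPLEXITY_WEIGHTS[level] for level in closed_tasks)


one_complex = weekly_throughput(["complex"])
five_simple = weekly_throughput(["simple"] * 5)
print(one_complex, five_simple)
```

Under these assumed weights, one complex task counts the same as five simple ones, so an agent handling harder work is not ranked below one closing many trivial issues.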

Completion Rate

Percentage of assigned tasks completed successfully on the first attempt without human intervention. Rewrites, re-assignments, and escalations reduce this score.
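The definition above can be sketched as a filter over task outcomes. The `TaskOutcome` fields are hypothetical stand-ins for whatever IronWorks records internally.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TaskOutcome:
    # Hypothetical per-task summary.
    succeeded: bool
    attempts: int
    human_intervened: bool


def first_try_completion_rate(outcomes: List[TaskOutcome]) -> float:
    """Percent of assigned tasks that succeeded on attempt one, unaided."""
    if not outcomes:
        raise ValueError("no assigned tasks to score")
    clean = sum(
        1
        for o in outcomes
        if o.succeeded and o.attempts == 1 and not o.human_intervened
    )
    return 100 * clean / len(outcomes)


outcomes = (
    [TaskOutcome(True, 1, False)] * 37   # clean first-try completions
    + [TaskOutcome(True, 2, False)] * 8  # needed a retry
    + [TaskOutcome(True, 1, True)] * 5   # needed a human
)
print(first_try_completion_rate(outcomes))
```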

CTO Agent — Project Breakdown

Web App Rebuild: grade A
Cost: $0.15
Speed: 19m
Tasks
Done: 97%

API Integration Sprint: grade B
Cost: $0.24
Speed: 31m
Tasks
Done: 84%

Client Onboarding Flow: grade A
Cost: $0.12
Speed: 14m
Tasks
Done: 99%

Per-Project Breakdown

An A overall can hide a C in one project.

Overall grades are useful but shallow. IronWorks breaks performance down by project so you can see if a strong agent is carrying one project and underperforming in another. This matters when you are billing clients or evaluating whether a specific project type suits a specific agent configuration.

Grade per agent per project, not just a blended average
Historical grade trend so you can see improvement or decline
Compare two agents in the same role side by side
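The following Python sketch shows why a blended average can mask a weak project. The scores, project names, and grade cutoffs are all made up for the example and do not reflect real IronWorks data or thresholds.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple

# Hypothetical cutoffs on an assumed 0-100 scale.
GRADE_CUTOFFS = [(90, "A"), (80, "B"), (70, "C"), (60, "D")]


def letter_grade(score: float) -> str:
    for cutoff, letter in GRADE_CUTOFFS:
        if score >= cutoff:
            return letter
    return "F"


def grades_by_project(task_scores: List[Tuple[str, float]]) -> Dict[str, str]:
    """Group per-task scores by project and grade each project separately."""
    by_project = defaultdict(list)
    for project, score in task_scores:
        by_project[project].append(score)
    return {p: letter_grade(mean(s)) for p, s in by_project.items()}


scores = [
    ("Project X", 95), ("Project X", 97),
    ("Project X", 96), ("Project X", 98),
    ("Project Y", 72),
]
overall = letter_grade(mean([score for _, score in scores]))
print(overall, grades_by_project(scores))
```

With these invented numbers the blended grade is an A even though the agent earns a C on Project Y, because the four strong Project X tasks dominate the average.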

Recommendations

Grades come with specific fix suggestions, not vague warnings.

When an agent's score drops, IronWorks surfaces a specific recommendation based on which dimension scored lowest. If cost is the problem, it suggests prompt compression or a model downgrade. If completion rate is down, it flags the task types that are failing most often.

Specific suggestions, not generic "improve your agent" notices
Links directly to the task types or issues driving the low score
Grade history tracks whether changes you made had an effect
Alerts via Telegram or email when an agent drops more than one grade level
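A minimal sketch of the two rules described here: pick a suggestion from the lowest-scoring dimension, and alert when a grade falls by more than one level. The function names, the 0-100 scores, and the fallback message are assumptions; only the cost and completion suggestions come from the text above.

```python
from typing import Dict

GRADE_ORDER = ["A", "B", "C", "D", "F"]

# Suggestions for cost and completion paraphrase the page; others are not specified.
SUGGESTIONS = {
    "cost": "Try prompt compression or a model downgrade.",
    "completion": "Review the task types that are failing most often.",
}
FALLBACK = "Review recent tasks for this dimension."  # placeholder, not from the page


def recommend(dimension_scores: Dict[str, float]) -> str:
    """Return the suggestion tied to the lowest-scoring dimension."""
    lowest = min(dimension_scores, key=dimension_scores.get)
    return SUGGESTIONS.get(lowest, FALLBACK)


def should_alert(previous: str, current: str) -> bool:
    """True when the grade dropped by more than one level, e.g. A to C."""
    drop = GRADE_ORDER.index(current) - GRADE_ORDER.index(previous)
    return drop > 1


print(recommend({"cost": 85, "speed": 80, "throughput": 78, "completion": 62}))
print(should_alert("A", "C"), should_alert("A", "B"))
```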

Performance Insight

Recommendation: Engineer #2

Completion rate dropped from 88% to 74% this week. 3 of 4 failed tasks were related to TypeScript type checking.

Suggested fix: Add TypeScript strictness guidelines to this agent's system prompt and link to the Engineering Standards KB article.

Insight: HR Agent

Consistent A rating for 4 consecutive weeks. Lowest cost per task on the team at $0.14. Consider this agent's prompt structure as a template for others.

Export this agent's configuration to share across your org or create a template from it.

Related Features

War Room

Command center for real-time agent operations. Performance grades surface directly in the War Room dashboard.

Org Chart

The reporting structure that determines how work flows. Performance ratings roll up through the org chart hierarchy.

Cost Tracking

Per-agent LLM cost data feeds directly into the A-F grading formula. High cost, low output = lower grade.

Performance ratings are included on all plans. View IronWorks pricing starting at $79/month.

Stop guessing which agents deliver. Start reading the grade.

Performance ratings are included in every IronWorks plan. No analytics add-ons. No premium tier required.

