Dalius's blog

RSS

Tuesday, June 9, 2026

Taskey: task management for agents

I have tried to find simple task management tool for agentic coding and none was simple enough. I have checked about 10 different ones and they are either too complicated, doing too much or not suitable for agentic coding. Beads most probably matched the most what I need, but I found it too aggressive and I do not like that it commits content to git repository.

Solution

Taskey - https://github.com/daliusd/taskey

It is zero config task management for agentic development. It allows to create task list per git worktree (usually agent creates tasks itself, but can be done by human too) and implement tasks one-by-one (or whatever way you want to).

Installation:

npm install -g @daliusd/taskey
npx skills add https://github.com/daliusd/taskey --skill taskey

Usage:

  • Ask to “create taskey tasks” when planning new feature. E.g. like this: step 1

  • You have taskey command line util which shows available commands when run without arguments: step 2

  • taskey list will show currently planned tasks for you git worktree: step 3

  • taskey next will show information about next command step 4

  • In Claude Code / Codex CLI / pi agent you can use for example /taskey next to implement next available task. You can do this in new session and it should have enough information to implement what task is describing.

  • Use taskey delete-all in your git worktree if you want to throw away all the tasks (e.g. you have decided that your experiment has not worked and there is no need to continue with implementation).

Problem

Usually 200k tokens context window is enough to implement many tasks from start till end, but problems start when you have task that needs more than 200k. There are multiple ways to address this:

  • models that have context window larger than 200k. Problem with this is that model become less smart after 100k tokens (search for “Matt Pocock Context Cliff”). I have experienced this with Opus after 200k tokens.

  • (Auto)Compaction. The main problem here is that compaction is surprisingly expensive, but overall it works.

  • Tasks. Claude Code has tasks support, there are beads, beans, simple markdown file with checklist and many more options. The idea here is to split your big task into many smaller ones and handle them one-by-one (Matt Pocock’s Sandcastle actually allows to implement tasks that do not have prerequisites in parallel in different git branches).

Taskey falls into the last category. Let’s say you have feature that touches one library, back end and front end. You can implement each part separately, but back end depends on library and front end depends on back end. Taskey allows to record what tasks must be done and relation between them. That’s it.

Cost

This one is complicated to estimate, but I have tried. My assumptions:

  • Each next user interaction sends all previous messages and there is caching with 90% discount. This is how Claude and Codex models work.

  • Each round produces 750 output tokens on average (e.g. agent asks to run several tool calls or user prompts something). I have asked AI to write tool to estimate this number from my Claude Code, Codex CLI and pi-agent sessions.

  • Harness (system prompt + tools information) adds about 25k tokens overhead on each new session.

  • If we have large task that can be split into multiple smaller then I assume that context usage between multiple smaller tasks is the same as of one large (harness excluded). E.g. 445k tokens one large task (420k task + 25k harness), or 5 109k small tasks (84k per task + 25k harness for each new session per task).

Estimation has shown that token usage in this theoretical scenarios should be from 20% to 50% smaller when doing multiple smaller sessions vs one big. What has surprised me is that sweet spot is around 50k tokens (25k harness + 25k task). Now I want this to be verified by somebody smarter than me.