
GitAra: How We Trained a 3B Function-Calling Git Agent for Local Use

A 3B function-calling model that turns plain English into git commands. It matches the 120B teacher at 92% accuracy while being 40x smaller, and runs locally via Ollama.

View on GitHub


GitAra = git + ara: your local stochastic parrot for git commands (with a knack for music).

We fine-tuned a small, tool-calling language model to turn plain English into git commands with the accuracy of a cloud LLM. Because it’s small, you can run it locally on your own machine — no API keys, no cloud dependencies, full privacy.


Results

| Model | Parameters | Accuracy | Model link |
|---|---|---|---|
| GPT-OSS 120B (teacher) | 120B | 0.92 +/- 0.02 | |
| Llama 3.2 3B Instruct (tuned) | 3B | 0.92 +/- 0.01 | HuggingFace |
| Llama 3.2 1B Instruct (tuned) | 1B | 0.90 +/- 0.01 | HuggingFace |
| Llama 3.2 3B Instruct (base) | 3B | 0.12 +/- 0.05 | |
| Llama 3.2 1B Instruct (base) | 1B | 0.0 +/- 0.01 | |

The tuned 3B model matches the 120B teacher while being 40x smaller. The 1B model is within one standard deviation while being 120x smaller.

All models available in the HuggingFace collection.


The Task

A practical Git assistant that interprets natural language requests and outputs appropriate Git commands:

  • “what’s in the latest stash, show diff” → git stash show --patch
  • “push feature-x to origin, override any changes there and track it” → git push origin feature-x --force --set-upstream

We support 13 core Git commands: status, add, commit, push, pull, branch, switch, restore, merge, stash, rebase, reset, and log. We deliberately exclude the older checkout in favor of its modern replacements, switch and restore.


Tool Calling Overview

The implementation uses JSON schemas following OpenAI’s function-calling format:

{
    "type": "function",
    "function": {
        "name": "git_add",
        "description": "Stage files for commit",
        "parameters": {
            "type": "object",
            "properties": {
                "files": {
                    "type": "array",
                    "description": "List of file paths to stage (use ['.'] for all files)",
                    "items": { "type": "string" },
                    "minItems": 1
                }
            },
            "required": ["files"],
            "additionalProperties": false
        }
    }
}

The model returns responses like:

{"name": "git_add", "parameters": {"files": ["README.md"]}}
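A response in this format can be mapped onto a concrete command line with a small dispatcher. The sketch below is illustrative only and is not GitAra's actual implementation; real argument rendering differs per tool:

```python
import json
import shlex

def render_git_command(tool_call: str) -> str:
    """Turn a function-call JSON string into a git command line.
    Illustrative sketch: only the git_add case is handled here."""
    call = json.loads(tool_call)
    name = call["name"]                      # e.g. "git_add"
    params = call.get("parameters", {})
    subcommand = name.removeprefix("git_")   # "git_add" -> "add"
    args = []
    if subcommand == "add":
        # Quote paths so filenames with spaces survive the shell.
        args = [shlex.quote(f) for f in params["files"]]
    return " ".join(["git", subcommand, *args])

print(render_git_command('{"name": "git_add", "parameters": {"files": ["README.md"]}}'))
# git add README.md
```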

A crucial feature is a do_nothing tool that allows the model to decline unreasonable requests instead of generating arbitrary commands.
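The post does not show the do_nothing schema itself; in the same function-calling format it might look like the following (the field names and description here are assumptions, not GitAra's actual definition):

```json
{
    "type": "function",
    "function": {
        "name": "do_nothing",
        "description": "Decline a request that does not map to a supported git command",
        "parameters": {
            "type": "object",
            "properties": {
                "reason": {
                    "type": "string",
                    "description": "Why the request was declined"
                }
            },
            "required": ["reason"],
            "additionalProperties": false
        }
    }
}
```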


Creating the Seed Dataset

We created approximately 100 examples showing requests paired with expected tool calls:

| Input | Output |
|---|---|
| apply stash@{5} | `{"name": "git_stash", "parameters": {"action": "apply", "stash_ref": "stash@{5}"}}` |
| merge vendor branch preferring ours | `{"name": "git_merge", "parameters": {"branch": "vendor", "strategy": "ours"}}` |
| show 8 commits for current branch with graph | `{"name": "git_log", "parameters": {"limit": 8, "graph": true}}` |
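A seed pair like these can be stored one JSON object per line (JSONL), a common format for small instruction datasets. The field names below are assumptions for illustration, not the actual distil labs training format:

```python
import json

# Hypothetical seed-example record: "input"/"output" keys are assumptions.
pair = {
    "input": "apply stash@{5}",
    "output": {
        "name": "git_stash",
        "parameters": {"action": "apply", "stash_ref": "stash@{5}"},
    },
}

# Serialize as a single JSONL line.
line = json.dumps(pair)
print(line)
```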

Training the Student

We expanded the 100 seed examples into 10,000 training pairs using the distil labs platform’s generation capabilities. This allowed fine-tuning Llama 3.2 3B Instruct — a model with 40x fewer parameters than the teacher — to match performance.

Most queries take less than 2 seconds to return a response on an M4 MacBook Pro once the model is loaded.
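Querying the tuned model locally goes through Ollama's chat API, which accepts an OpenAI-style tools list. A sketch using only the standard library; the model tag `gitara` is a placeholder for whatever name you registered the model under:

```python
import json
import urllib.request

# Placeholder model tag: substitute the name you pulled or created in Ollama.
MODEL = "gitara"

# Same kind of tool schema the model was trained on (git_add shown earlier).
GIT_ADD_TOOL = {
    "type": "function",
    "function": {
        "name": "git_add",
        "description": "Stage files for commit",
        "parameters": {
            "type": "object",
            "properties": {
                "files": {"type": "array", "items": {"type": "string"}, "minItems": 1},
            },
            "required": ["files"],
        },
    },
}

# Request body for Ollama's /api/chat endpoint.
payload = {
    "model": MODEL,
    "messages": [{"role": "user", "content": "stage the readme"}],
    "tools": [GIT_ADD_TOOL],
    "stream": False,
}

request = urllib.request.Request(
    "http://localhost:11434/api/chat",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
# Uncomment once an Ollama server with the model is running:
# with urllib.request.urlopen(request) as resp:
#     reply = json.loads(resp.read())
#     print(reply["message"].get("tool_calls"))
```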


Future Improvements

  • Constrained decoding to guarantee syntactically-valid JSON output
  • Multi-turn workflows for handling complex, iterative tasks
  • Quantization to reduce model size without significant performance loss

Conclusion

GitAra demonstrates a generalizable workflow for tool-calling scenarios applicable beyond Git assistance. While manual implementation requires substantial effort, the distil labs platform abstracts away most of the difficult parts.

