Building rtl-aid: CI-Native Documentation for RTL Projects
rtl-aid is a small documentation and linting toolkit for Verilog and SystemVerilog projects. It generates one Markdown page per RTL module, preserves the human-written description, and keeps the mechanical sections such as ports, parameters, and module calls in sync with the source.
The project also includes rtllint, a companion command that runs Verilator lint and tags warnings inline in the RTL. The goal is not to replace a full EDA flow. The goal is to make an unfamiliar RTL codebase easier to inspect, review, and maintain from the command line and from agentic coding tools.
Repository: https://github.com/vishwaksen-1/rtl-aid
What this post is about
This post explains why I built rtl-aid, what it currently does, and the design constraints behind it.
It covers:
Generating module documentation from RTL.
Preserving manual descriptions while updating generated sections.
Building a simple module dependency graph.
Making documentation checks usable in CI.
Tagging Verilator warnings directly inside source files.
The limits of using a lightweight parser instead of a full HDL frontend.
It does not describe a complete Verilog parser, formal verification, or synthesis integration, and rtl-aid is not a replacement for tools like Verilator, Yosys, Surelog, or commercial EDA suites.
Background
RTL projects become hard to read in a very specific way.
In software, a new contributor can often start with package names, imports, tests, and runtime entry points. In RTL, the important structure is usually spread across module declarations, parameters, port lists, instantiations, include paths, testbenches, and build scripts. Even a small design can make you ask basic questions before you can make progress:
What modules exist?
Which module instantiates which child?
What are the top-level inputs and outputs?
Which parameters configure this block?
Is this file part of the design or only a testbench?
Where are the existing lint issues?
The obvious answer is "read the source," but that does not scale well when you are only trying to build a mental map. A lot of RTL source is structural information surrounded by implementation details. I wanted a way to extract the structural layer first, then decide which files deserved deeper reading.
That became rtl-aid.
Why this matters
The practical value is not that the generated documentation is beautiful. The value is that it is cheap to regenerate, predictable in shape, and good enough to guide the next action.
For a human reviewer, rtl-aid turns a pile of .v and .sv files into a browsable module index. For an AI coding agent, it provides a much smaller representation of the design than raw RTL. Instead of spending context on every block and expression, the agent can first inspect the generated Markdown and graph.json.
A typical first pass looks like this:
rtldoc -d rtl/ -o .agent/docs/ --json-graph
After that, each module has a Markdown file with the same sections:
Description
Parameters
Inputs
Outputs
Inouts
Calls
Called By
The Description section is human-managed. Everything else is generated. That split is the core of the tool: humans provide meaning, the tool maintains the boring structural truth.
Constraints
The first version of rtl-aid is deliberately small.
Runtime: Python 3.7+.
Dependencies: no Python runtime dependencies.
External tools: rtllint requires Verilator, while rtldoc is standalone.
Parser: regex-based, not a full Verilog/SystemVerilog frontend.
Scope: one module per file.
Supported style: ANSI-style Verilog-2001/SystemVerilog module headers.
Output: plain Markdown and JSON.
CI behaviour: deterministic exit codes and diff-aware writes.
These constraints shaped most decisions. I wanted something that could be installed with:
pip install rtl-aid
and then run in a repository without pulling in a parser stack, a web app, or a database. That meant accepting a narrower syntax target and making the limitations explicit.
The tradeoff is clear: rtl-aid is not the right tool if your codebase relies heavily on pre-2001 port declarations, multiple modules per file, typedef-heavy declarations, or macro-expanded structure. It is useful when the codebase follows common ANSI module style and you want fast structural documentation.
System design
rtl-aid has two commands:
| Command | Purpose |
|---|---|
| rtldoc | Generate and maintain per-module Markdown docs. |
| rtllint | Run Verilator lint and tag warning lines inline. |
The documentation flow is:
Scan .v and .sv files from directories or explicit file lists.
Ignore known testbench suffixes such as _tb.v, _tb.sv, _bench.v, and _testbench.sv.
Strip comments before parsing.
Extract the first module declaration from each file.
Parse parameters, inputs, outputs, and inouts.
Detect instantiations of known modules.
Build the reverse called_by graph.
Generate or update Markdown.
Optionally write graph.json.
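The reverse-graph step in the flow above can be sketched in a few lines. This is an illustration under assumptions, not rtldoc's actual code: the function name build_called_by and the in-memory shape of the calls map are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def build_called_by(calls: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Invert a parent -> children map into a child -> parents map."""
    parents = defaultdict(set)
    for parent, children in calls.items():
        for child in children:
            parents[child].add(parent)
    # Every scanned module gets an entry, even if nothing instantiates it.
    return {name: sorted(parents[name]) for name in sorted(calls)}


# Hypothetical scan result: module name -> modules it instantiates.
calls = {
    "cpu_core": ["alu", "decoder", "register_file"],
    "alu": ["mux4"],
    "decoder": [],
    "register_file": [],
    "mux4": [],
}
called_by = build_called_by(calls)
```

Sorting both the keys and the parent lists keeps the output stable between runs, which matters when the result ends up in files that are diffed.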
The graph export is intentionally simple:
{
  "cpu_core": {
    "calls": ["alu", "decoder", "register_file"],
    "called_by": []
  },
  "alu": {
    "calls": ["mux4"],
    "called_by": ["cpu_core"]
  }
}
This makes it easy to consume from scripts, CI, or an agent. You do not need to parse Markdown to recover the dependency map.
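As a small example of consuming the graph from a script, the sketch below finds the top-level modules, meaning those with an empty called_by list. The JSON is inlined so the snippet is self-contained; in practice it would be read from the graph.json file. The helper name top_level_modules is made up for this example.

```python
import json

# Inline copy of the example graph shown above.
GRAPH_JSON = """
{
  "cpu_core": {"calls": ["alu", "decoder", "register_file"], "called_by": []},
  "alu": {"calls": ["mux4"], "called_by": ["cpu_core"]}
}
"""


def top_level_modules(graph):
    """Return modules that no other scanned module instantiates."""
    return sorted(name for name, node in graph.items() if not node["called_by"])


graph = json.loads(GRAPH_JSON)
tops = top_level_modules(graph)
print(tops)  # prints ['cpu_core']
```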
Important decisions
Preserve descriptions
Generated documentation usually fails when it overwrites the only useful human text.
rtldoc avoids that by treating the Description section as user-owned. If a module doc already exists, the description is preserved and the generated sections are replaced. If the file is new, the description starts as:
TODO: Add description
CI mode can then fail if descriptions are still missing:
rtldoc -d rtl/ -o docs/modules/ --ci --print-errors
The result is a useful split of responsibility. The tool owns facts it can extract. The developer owns intent, context, and design notes.
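One way to implement this kind of split is to pull the Description body out of the existing page before regenerating everything else. The sketch below assumes the section layout shown in this post; extract_description and render_doc are hypothetical names, and rtldoc's own implementation may differ.

```python
import re

PLACEHOLDER = "TODO: Add description"


def extract_description(existing_md: str) -> str:
    """Return the body of the Description section, or the placeholder."""
    match = re.search(
        r"^## Description[ \t]*\n(.*?)(?=^## |\Z)",
        existing_md,
        flags=re.MULTILINE | re.DOTALL,
    )
    if match is None:
        return PLACEHOLDER
    body = match.group(1).strip()
    return body or PLACEHOLDER


def render_doc(name: str, description: str, generated_sections: str) -> str:
    """Combine the user-owned description with freshly generated sections."""
    return f"# {name}\n## Description\n{description}\n{generated_sections}"


old_doc = "# alu\n## Description\n8-bit ALU with carry out.\n## Inputs\n- clk\n"
new_sections = "## Inputs\n- clk\n- rst\n"
updated = render_doc("alu", extract_description(old_doc), new_sections)
```

A limitation of this sketch: a description that itself contains a line starting with "## " would be cut off at that line.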
Use diff-aware writes
Documentation generators can create noisy commits if they rewrite files every time they run. rtldoc only writes a Markdown file when the content has actually changed.
That makes it safe to run before every commit or inside CI. If nothing changed structurally, nothing gets touched.
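A diff-aware write is simple to sketch: read the current file, compare, and only write on a difference. The helper name write_if_changed is an assumption for illustration and is not taken from the rtl-aid source.

```python
import os
import tempfile


def write_if_changed(path: str, content: str) -> bool:
    """Write content to path only when it differs; return True if written."""
    try:
        with open(path, "r", encoding="utf-8") as handle:
            if handle.read() == content:
                return False
    except FileNotFoundError:
        pass  # New file: fall through and create it.
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(content)
    return True


with tempfile.TemporaryDirectory() as tmp:
    doc_path = os.path.join(tmp, "alu.md")
    first = write_if_changed(doc_path, "# alu\n")   # file created
    second = write_if_changed(doc_path, "# alu\n")  # identical, left untouched
    third = write_if_changed(doc_path, "# alu\n## Inputs\n- clk\n")  # updated
```

Returning a boolean also gives the caller a cheap way to count how many pages changed in a run.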
Dry-run mode performs the same comparison without writing any files:
rtldoc -d rtl/ -o docs/modules/ --dry-run
With -vv, rtldoc can also show section-level additions and removals, which is useful before committing a source change.
Keep the graph separate
Markdown is nice for humans. JSON is better for tools.
The --json-graph flag writes a machine-readable dependency graph next to the docs. This is especially useful for agent workflows. An agent can inspect the graph first, identify the top-level modules, and then read only the specific generated docs it needs.
That is the main agentic pattern:
rtldoc -d rtl/ -o .agent/docs/ --json-graph
Read the graph. Read the relevant module docs. Only then jump into the source.
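That reading order can be derived from the graph itself. The sketch below walks the calls edges breadth-first from a chosen root and lists the doc pages to read. It assumes one page per module named after the module, as the links in the generated Markdown suggest; docs_to_read is a hypothetical helper.

```python
from collections import deque


def docs_to_read(graph, root, docs_dir=".agent/docs"):
    """List doc paths for root and everything it transitively calls."""
    order, seen, queue = [], {root}, deque([root])
    while queue:
        name = queue.popleft()
        order.append(f"{docs_dir}/{name}.md")
        # Modules missing from the graph are treated as leaves.
        for child in graph.get(name, {}).get("calls", []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return order


graph = {
    "cpu_core": {"calls": ["alu", "decoder", "register_file"], "called_by": []},
    "alu": {"calls": ["mux4"], "called_by": ["cpu_core"]},
}
reading_list = docs_to_read(graph, "cpu_core")
```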
Keep lint visible but non-blocking
rtllint is intentionally different from a strict CI lint gate. It runs Verilator and annotates the source line where a warning or error appears:
assign result = a + b; /* Check: Operator ADD generates 9 bits ... */
It also inserts small test metadata near the top of the file:
// lint-test: verilator --lint-only -Wall rtl/alu.v
// tb-test: tba
The comments are idempotent. Re-running the command replaces existing /* Check: */ tags instead of stacking duplicates.
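Idempotent tagging can be sketched as "strip any existing tag, then append the new one". The regex and the tag_line helper below are assumptions based on the tag format shown above, not the actual rtllint code.

```python
import re

# Matches a trailing tag of the form /* Check: ... */ at the end of a line.
CHECK_TAG = re.compile(r"\s*/\* Check: .*?\*/\s*$")


def tag_line(line: str, message: str) -> str:
    """Attach a Check tag to a line, replacing any tag already present."""
    bare = CHECK_TAG.sub("", line.rstrip("\n"))
    return f"{bare} /* Check: {message} */"


once = tag_line("assign result = a + b;", "Operator ADD generates 9 bits")
twice = tag_line(once, "Operator ADD generates 9 bits")
```

Here once and twice are the same string, so re-running the tagger leaves the file unchanged. A message containing "*/" would break the comment and would need escaping in a real tool.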
This gives a review workflow where lint debt is searchable, visible in diffs, and close to the code that caused it. It does not have to block the build immediately.
Implementation notes
The parser is intentionally direct.
It strips comments, then looks for a module header of the form:
module <name> [#(...)] (...);
From there it extracts:
Parameters from the optional #(...) block.
Inputs, outputs, and inouts from the port list.
Comma-inherited directions such as output reg a, b, c.
Instantiations that match known module names.
That last part matters. rtldoc only records calls to modules it has already discovered in the current scan. This avoids treating every function-like token as a module instance.
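Two of these steps are easy to illustrate: carrying a port direction across commas, and restricting instance detection to known module names. The code below is a rough sketch under assumptions, not the rtldoc parser. All names in it (split_top_level, parse_ports, find_calls) are hypothetical, and it ignores many real-world cases such as string literals and macros.

```python
import re

COMMENTS = re.compile(r"//[^\n]*|/\*.*?\*/", re.DOTALL)
DIRECTION = re.compile(r"(input|output|inout)\b")
# Optional parameter override with at most one level of nested parentheses.
PARAMS = r"(?:#\s*\((?:[^()]|\([^()]*\))*\)\s*)?"


def split_top_level(text):
    """Split on commas that are not nested inside (), [] or {}."""
    parts, depth, current = [], 0, []
    for ch in text:
        if ch in "([{":
            depth += 1
        elif ch in ")]}":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    return parts + [tail] if tail else parts


def parse_ports(port_list):
    """Group port names by direction; a bare name inherits the last direction."""
    ports = {"input": [], "output": [], "inout": []}
    direction = None
    for item in split_top_level(port_list):
        match = DIRECTION.match(item)
        if match:
            direction = match.group(1)
        # Drop [msb:lsb] ranges, then take the last identifier as the name.
        names = re.findall(r"[A-Za-z_]\w*", re.sub(r"\[.*?\]", "", item))
        if direction and names:
            ports[direction].append(names[-1])
    return ports


def find_calls(source, known_modules):
    """Return the known modules instantiated in source, sorted by name."""
    text = COMMENTS.sub("", source)
    found = []
    for name in sorted(known_modules):
        # Shape: <module> [#(...)] <instance_name> (
        pattern = r"\b" + re.escape(name) + r"\b\s*" + PARAMS + r"[A-Za-z_]\w*\s*\("
        if re.search(pattern, text):
            found.append(name)
    return found


SOURCE = """
module cpu_core (input clk, input rst);
  // alu u_old (.clk(clk));
  alu #(.DATA_WIDTH(8)) u_alu (.clk(clk), .rst(rst));
  decoder u_dec (.clk(clk));
  assign sum = add8(a, b);
endmodule
"""
ports = parse_ports("input clk, rst, input [7:0] operand_a, operand_b, output reg [8:0] result")
calls = find_calls(SOURCE, {"alu", "decoder", "mux4"})
```

Because add8 is not in the known set, the function call is never considered, and the commented-out alu instance is removed before matching.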
The generated Markdown is predictable:
# alu
## Description
TODO: Add description
## Parameters
- DATA_WIDTH = 8
## Inputs
- clk
- rst
- operand_a
- operand_b
## Outputs
- result
## Inouts
- None
## Calls
- [mux4](mux4.md)
## Called By
- [cpu_core](cpu_core.md)
The important part is not the formatting. What matters is that every module page has the same shape.
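Keeping that shape fixed is mostly a matter of rendering every section in a constant order and emitting a placeholder for empty ones. The sketch below reproduces the example page above; render_module_doc and its input dictionary are hypothetical, and the exact whitespace of the real output is an assumption.

```python
SECTIONS = ["Parameters", "Inputs", "Outputs", "Inouts", "Calls", "Called By"]
LINKED = {"Calls", "Called By"}  # sections whose items link to other pages


def render_module_doc(name, description, sections):
    """Render one module page with every section present, in a fixed order."""
    lines = [f"# {name}", "## Description", description]
    for title in SECTIONS:
        lines.append(f"## {title}")
        items = sections.get(title, [])
        if not items:
            lines.append("- None")
        elif title in LINKED:
            lines.extend(f"- [{item}]({item}.md)" for item in items)
        else:
            lines.extend(f"- {item}" for item in items)
    return "\n".join(lines) + "\n"


page = render_module_doc(
    "alu",
    "TODO: Add description",
    {
        "Parameters": ["DATA_WIDTH = 8"],
        "Inputs": ["clk", "rst", "operand_a", "operand_b"],
        "Outputs": ["result"],
        "Calls": ["mux4"],
        "Called By": ["cpu_core"],
    },
)
```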
Future work
Add Graphviz or Mermaid export for the dependency graph.
Build an MCP server wrapper so tools like Claude, Cursor, Devin, and other agents can call rtl-aid without shelling out manually.
Add a GitHub Actions workflow template.
Improve parser coverage for common real-world RTL styles.
Add an optional, customizable doc format instead of enforcing the hardcoded one.
Add an optional backend using Tree-sitter, PyVerilog, Surelog, or another parser for projects that need deeper SystemVerilog support.
Add a cleanup command for removing lint tags.
Generate a browsable static HTML view from the Markdown and graph.
The larger direction is to make RTL projects easier to enter. Not by hiding the source, but by giving humans and agents a reliable map before they start reading every file.
That is the real point of rtl-aid: turn structure into a cheap artefact, keep it current, and let the deeper engineering attention go where it actually matters.





