Using Vale to help engineers become better writers

François Violette

Apr 4, 2023

From design docs to wikis, README files, inline comments in code or even articles like this one, engineers write more prose every day than they realize.

Although they write often, many engineers find it hard to comply with writing standards or to take part in defining them. This may be because the standards live only in a humongous PDF style guide that nobody reads, or because the contributing experience feels alien and does not fit their daily workflow.

In this post, we explore Contentsquare’s approach to helping engineers develop a sense of ownership over content, using our public technical documentation as a use case.

Engineers versus writers?

In small organizations, engineers and writers are often the same people. But even in larger teams where the engineer-to-writer ratio usually varies between 7:1 and 40:1, the boundaries can be fluid.

Regardless of size, engineers are the subject-matter experts. As a technical writer or a product manager, you will not succeed in writing effective technical documentation without proactively and genuinely engaging with them.

This is why the Docs as Code model has become the standard for developer docs. It accounts for both how organizations are staffed and how engineers work, by making engineers first-class contributors to documentation.

We expect engineers to write docs using tools of the trade: content is written in their IDE using plain text markup, changes are tracked by version control, website builds are automated. Like software development, writing becomes collaborative and distributed by design.

The developer loop and the writer loop

The interplay between the engineer and writer roles is clearly articulated by Kelsey Hightower in his foreword to Docs for Developers:

[The developer loop is basically a workflow that involves:]

Attempting to understand the problem

Searching for an existing solution everywhere possible

If a solution exists, prove to oneself the solution works

Pushing the solution found to production

In this process, documentation is expected to help developers through each of these steps. Documentation is a feature. It’s actually the first feature of a project developers will interact with.

And so similar to the developer loop, there’s the writer loop.

During the writer loop, we create information our developer users want during the developer loop. Aligning both loops is critical to set up a project for success.

Unlocking engineering contributions at Contentsquare

The Contentsquare engineering team has grown significantly in the past few years and this triggered a number of challenges, one of them being the scalability of our documentation process.

Getting docs right would take time, rewrites, and migrations. We naturally gravitated towards a Docs as Code workflow by making our engineers actors in both loops.

We made some tactical moves to foster contributions from our engineer-writers. Some of our high-level considerations were:

Prevent context switching between writing code and writing docs.
Maintain consistency between writers from different teams and backgrounds.
Automation and reproducibility will lead to engineering buy-in.
If engineers enjoy writing, they will contribute more.

Among the improvements that we made to our home-grown docs site, there is one tool that checked all these boxes and focused specifically on content…

Vale, computer-assisted editing

Vale is a prose linter. Just like a code linter, it ensures your content is compliant in terms of spelling, terms, grammar and everything else that you instruct it to check.

Vale is the programmatic companion to your style guide. It is particularly suited to lint technical content. Whether you write in Markdown, AsciiDoc, or reStructuredText, Vale converts your content on the fly to HTML and validates the result.

Vale is open source and has been adopted by many large organizations. It is also faster than most alternatives.

Unlike tools like Grammarly, Vale runs offline so your content is never sent to a remote server. This makes it great for Contentsquare because we try to avoid vendor lock-in and black boxes.

Vale is not a writing assistant: it only surfaces errors for writers to fix, with the right cues and pointers. It does one thing and does it well.

We want to be in charge of our standards, tailor tools to our needs, and design editorial rules that reflect collective agreement.

We need control over what to check and when. Vale provides all of that.

Getting started with Vale

One of the most valuable things Vale offers out of the box is the ability to leverage existing style guides, such as Google’s or Microsoft’s.

We will start off using the Google package and then we will add some custom rules.

Setup

You can install Vale using many package managers. On macOS with Homebrew, you can just run brew install vale.

Vale’s default configuration file is .vale.ini at the root of your project. You can generate one which includes the Google package:

StylesPath = vale-styles

MinAlertLevel = suggestion

Packages = Google

[*]
BasedOnStyles = Vale, Google

Within your project, run vale sync to download the package, which is unzipped in your StylesPath.

❯ vale sync
SUCCESS  Downloaded package 'Google'
Downloading packages [1/1] ████████████████████ 100% | 1s

And then run Vale via the CLI to lint your content files. Assuming you have Markdown files in a src/pages directory, just run:

❯ vale src/pages --glob='*.md'

src/pages/uxa-en/index.md
6:5      error       Did you really mean             Vale.Spelling
                     'Contentsquare'?
69:149   suggestion  In general, use active voice    Google.Passive
                     instead of passive voice
                     ('being blocked').
80:116   suggestion  Feel free to use 'don't'        Google.Contractions
                     instead of 'do not'.
389:9    error       'the' is repeated!              Vale.Repetition
2413:95  suggestion  Spell out 'ISO', if it's        Google.Acronyms
                     unfamiliar to the audience.
5557:74  error       Did you really mean             Vale.Spelling
                     'screenview'?

src/pages/android/4-20-0/index.md
...

✖ 673 errors, 12910 warnings and 14878 suggestions in 82 files.

The run is complete but the Vale log is too big and not actionable. A fair share of rules from the Google style guide don’t apply to our content so we’d rather cherry-pick some of them:

StylesPath = vale-styles

MinAlertLevel = suggestion

Packages = Google

[*.md]
# Adding the Google style here enables all its rules
BasedOnStyles = Vale

# So we choose to enable only selected rules
Google.Contractions = YES
Google.FirstPerson = YES
Google.Gender = YES
Google.Passive = YES
Google.We = YES

Running Vale again produces a more useful report, where every log line is something to act on.

✖ 102 errors, 501 warnings and 889 suggestions in 82 files.
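To decide which rules are worth keeping, it can help to count alerts per rule rather than read the whole log. Vale can emit its report as JSON with the --output=JSON flag. The sketch below assumes that the report maps each file path to a list of alerts and that each alert names its rule in a "Check" field; the sample data is hand-written for illustration and is not real output.

```python
import json
from collections import Counter


def count_alerts_by_check(vale_json: str) -> Counter:
    """Count alerts per rule name in a Vale JSON report."""
    report = json.loads(vale_json)  # assumed shape: {file path: [alert, ...]}
    counts = Counter()
    for alerts in report.values():
        for alert in alerts:
            counts[alert["Check"]] += 1  # "Check" is assumed to hold the rule name
    return counts


# Hand-written sample in the assumed report shape, not real Vale output.
sample = json.dumps({
    "src/pages/uxa-en/index.md": [
        {"Check": "Vale.Spelling", "Severity": "error", "Line": 6},
        {"Check": "Google.Passive", "Severity": "suggestion", "Line": 69},
        {"Check": "Vale.Spelling", "Severity": "error", "Line": 5557},
    ]
})

# Print the noisiest rules first.
for check, total in count_alerts_by_check(sample).most_common():
    print(f"{total:5d}  {check}")
```

In practice, you would feed the function the output of something like vale --output=JSON src/pages instead of the sample string.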

Creating your own rules

Vale rules are plain YAML files with various extension points such as existence, substitution, spelling, or metric.

Project structure

Let’s create a vale-styles/Contentsquare folder which will contain our specific rules:

├── .vale.ini
├── package.json
├── README.md
├── src
└── vale-styles
    ├── Contentsquare
    │   ├── Admonitions.yml
    │   ├── English.yml
    │   ├── ExternalLinks.yml
    │   ├── InternalLinks.yml
    │   ├── Level.yml
    │   ├── MergeConflictMarkers.yml
    │   ├── Recommendations.yml
    │   ├── Repetition.yml
    │   ├── Spelling.yml
    │   ├── Terms.yml
    │   ├── Todo.yml
    │   └── ...
    └── Google
        └── ...

We reference this folder in our configuration file and remove the built-in Vale style.

StylesPath = vale-styles

MinAlertLevel = suggestion

Packages = Google

[*.md]
BasedOnStyles = Contentsquare

Google.Contractions = YES
Google.FirstPerson = YES
Google.Gender = YES
Google.Passive = YES
Google.We = YES

Our first rule

We create the Spelling.yml rule to enrich the built-in dictionary with our entries, under filters. Many Vale rules work with regular expressions.

extends: spelling
message: 'Spelling check: "%s"?'
level: warning
filters:
  - '[aA]nonymiz(e|ation)'
  - 'Contentsquare'
  - 'screenview'
  ...

Adding these exceptions lowers the number of log lines in the Vale report even more. There shouldn’t be any more false positives.

Some of our custom rules

Once you get familiar with the YAML syntax, it is very easy to implement your style guide. While we are not yet ready to properly publish ours, we can share some custom rules we found useful.

Repeats

Repeats are flagged automatically by most WYSIWYG authoring tools but it is still important to check for them in Vale. Here is an example straight from the Vale docs:

extends: repetition
message: '"%s" is repeated.'
level: error
alpha: true
tokens:
  - '[^\s]+'
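To make the idea behind this rule concrete, here is a rough Python approximation of what it looks for. It is only a sketch: it reuses the rule's token pattern, and it approximates alpha: true by skipping tokens that contain anything other than letters and digits. Vale's own tokenization and matching are more involved than this.

```python
import re


def find_repeats(text: str) -> list[str]:
    """Return tokens that appear twice in a row."""
    tokens = re.findall(r"[^\s]+", text)  # same token pattern as the rule
    repeats = []
    for previous, current in zip(tokens, tokens[1:]):
        # Approximation of `alpha: true`: ignore tokens with punctuation.
        if current.isalnum() and previous == current:
            repeats.append(current)
    return repeats


print(find_repeats("Click the the button to save."))
```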

Terms

A few years ago, we used ContentSquare or even Content Square instead of Contentsquare. It always takes time to change habits and even today some of us still make the occasional mistake. To fix this, we can spot incorrect forms and directly suggest the correct one instead.

extends: substitution
message: "Use '%s' rather than '%s'."
level: warning
ignorecase: false
action:
  name: replace
swap:
  content[sS]quare|ContentSquare|Contentqsuare|Contentsqure: Contentsquare
  cpu: CPU
  documentations: docs
  '[Jj]avascript': JavaScript
  XCode|xcode: Xcode
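The swap table above can be read as a list of regular expressions, each paired with a preferred form. The Python sketch below imitates that behavior to show which strings would be reported. The find_term_issues helper and the word-boundary wrapping are our own illustration, not Vale's implementation.

```python
import re

# Same patterns as the swap table: incorrect form (regex) -> preferred form.
SWAP = {
    r"content[sS]quare|ContentSquare|Contentqsuare|Contentsqure": "Contentsquare",
    r"cpu": "CPU",
    r"documentations": "docs",
    r"[Jj]avascript": "JavaScript",
    r"XCode|xcode": "Xcode",
}


def find_term_issues(text: str) -> list[str]:
    """Return one message per incorrect term, in order of appearance."""
    issues = []
    for pattern, preferred in SWAP.items():
        # Matching is case-sensitive, like `ignorecase: false`.
        for match in re.finditer(rf"\b(?:{pattern})\b", text):
            message = f"Use '{preferred}' rather than '{match.group(0)}'."
            issues.append((match.start(), message))
    return [message for _, message in sorted(issues)]


for line in find_term_issues("ContentSquare runs javascript on XCode with one cpu."):
    print(line)
```

Because matching is case-sensitive, the correct forms (Contentsquare, JavaScript, Xcode, CPU) are never reported.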

Grammar

We found adjectives used without hyphens in our docs, for instance "a project level flag" instead of the correct "a project-level flag".

With part-of-speech tagging support, Vale can report such occurrences:

extends: sequence
message: "Use %[1]s-level, as it is an adjective here."
level: warning
tokens:
  - tag: NN|JJ
  - tag: NN
    pattern: 'level'
  - tag: NN

Inconsistent markup

We used to write **Note**, WARNING:, or _Tip_:, without any standard. We have since migrated to using an <Aside /> component. This rule helped us migrate:

extends: existence
message: 'Use <Aside /> for admonitions'
level: warning
ignorecase: false
scope: raw
raw:
  - '(NOTE:|TIP:|(?<! )WARNING:|\*\*(Notes?|Warning|Tip):?\*\*)'

💡 The scope: raw option lets Vale parse the unprocessed Markdown instead of the final HTML.

Readability

Vale supports readability metrics out of the box. We can check our content against a Flesch-Kincaid grade level to fine-tune its readability:

extends: metric
message: "The grade level is %s. Aim for 8th grade or lower by using shorter sentences and words."
level: suggestion
formula: |
  (0.39 * (words / sentences)) + (11.8 * (syllables / words)) - 15.59
condition: "> 8"
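To get a feel for the formula, here it is as a small Python function, applied to made-up counts. The counts are purely illustrative; Vale computes the words, sentences, and syllables variables from your content.

```python
def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid grade level, using the same formula as the rule."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Made-up counts: 100 words in 5 sentences, 150 syllables.
long_sentences = flesch_kincaid_grade(words=100, sentences=5, syllables=150)

# Same length, but split into 10 sentences with simpler words (130 syllables).
short_sentences = flesch_kincaid_grade(words=100, sentences=10, syllables=130)

print(round(long_sentences, 2))   # 0.39 * 20 + 11.8 * 1.5 - 15.59, about 9.91
print(round(short_sentences, 2))  # 0.39 * 10 + 11.8 * 1.3 - 15.59, about 3.65
```

The first text scores above the 8th-grade target named in the message, while the second one sits well below it. Both shorter sentences and shorter words pull the grade down.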

Internal link format

We don’t allow absolute URLs in links to other content files in our docs, because they break preview environments, each of which has its own URL. This rule ensures that links between source files are always relative:

extends: existence
message: 'Do not use absolute URLs when linking to other pages'
level: warning
ignorecase: true
scope: raw
raw:
  - 'docs\.contentsquare\.com'

Merge conflicts

All of us developers probably committed an unclean merge at least once… Merge conflicts happen frequently with content too so let’s implement a safety net:

extends: existence
message: 'Merge conflict marker "%s" found.'
level: error
scope: raw
raw:
  - '\n<<<<<<< .+\n|\n=======\n|\n>>>>>>> .+\n'

This rule will flag occurrences like this one:

<<<<<<< HEAD
It means your PII is safe, as long as you use a test user account.
=======
It means your Personal Data is safe, as long as you use a test user account.
>>>>>>> fix/pii-personal-data
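You can sanity-check a pattern like this one outside Vale before shipping the rule. The sketch below runs the same expression through Python's re module against the sample above. Vale uses its own regular expression engine, so treat this as a quick approximation, not a guarantee of identical behavior.

```python
import re

# Same pattern as the rule's `raw` entry.
MARKERS = re.compile(r"\n<<<<<<< .+\n|\n=======\n|\n>>>>>>> .+\n")

sample = (
    "Some intro text.\n"
    "<<<<<<< HEAD\n"
    "It means your PII is safe, as long as you use a test user account.\n"
    "=======\n"
    "It means your Personal Data is safe, as long as you use a test user account.\n"
    ">>>>>>> fix/pii-personal-data\n"
)

# Strip the surrounding newlines so only the markers themselves remain.
found = [match.group(0).strip() for match in MARKERS.finditer(sample)]
print(found)
```

Each alternative starts with a newline, so a marker on the very first line of a file would not match in this sketch.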

Going further

For everything that’s not covered by Vale’s declarative syntax, you can use Go-like scripts.

However, if those scripts get too large, it may be a sign that you’ve gone too far with the linting or that you need to combine Vale with another external tool.

Getting feedback at the right time

As we fine-tune the “writer loop”, we want to encourage engineers to write content without interrupting their flow while still giving them useful hints at the right time.

IDE plugins

Because Vale is a CLI tool, it can be used in many contexts, including IDEs.

Not only are IDEs where engineers spend a fair share of their time, but feedback can also be distilled there rather than spat out in a large log file. It’s also available live, as you type.

Vale used with the VSCode extension

Continuous Integration

Plugging Vale into our Continuous Integration workflow allows us to surface errors to all contributors with every commit, even when they don’t use IDE plugins.

Here is a GitHub Actions workflow which uses the official Vale GitHub Action.

name: Lint prose
on: pull_request

jobs:
  lint-prose:
    runs-on: ubuntu-latest
    steps:
    - name: Checkout
      uses: actions/checkout@v3

    - name: Vale
      uses: errata-ai/vale-action@reviewdog
      with:
        files: src/pages
        vale_flags: --glob=!*/*-fr/*
        filter_mode: nofilter
        reporter: github-pr-check
      env:
        GITHUB_TOKEN: ${{secrets.GITHUB_TOKEN}}

It captures all errors just as it would locally and adds Vale annotations on modified lines only:

Vale GitHub Action drops annotations on pull requests

Getting to zero

When implementing Vale, you might find yourself going down a rabbit hole. Try to resist rolling out every single rule you can think of at the beginning!

We’d like to make a global fix pass and move on. However, like code, technical content is dynamic and often rewritten. In this process, we find Vale to be increasingly useful with every iteration.

It could be a while until Vale reports ✔ 0 errors, 0 warnings and 0 suggestions and that’s fine.

Let’s not forget that part of building standards is making sure they will be applied: we encourage you to split the bulk fix pass among contributors to build habits with this new workflow.

When we consider term and spelling exceptions for instance, we often need to involve subject-matter experts anyway. The point is to agree about whether you can use a term, why, and how. This is where the value lies.

After involving the relevant stakeholders, you can make decisions, enforce them in Vale, and see how they unfold!

Next steps

We still have a lot to improve, but not necessarily by adding many new rules. We want to focus more on where and how Vale is used in the “writer loop”.

Providing feedback earlier

Choosing when Vale runs is a trade-off between site build time and user experience. We chose to make Vale checks non-blocking for the moment: errors and warnings do not prevent our docs site from building. As docs ownership grows, we would like to require Vale to run before every commit, using the Vale pre-commit hook.
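To illustrate what running before every commit involves, here is a hand-rolled sketch of such a check in Python. It is not the official Vale pre-commit hook. It assumes that git and vale are on the PATH and that Vale exits with a non-zero status when it reports errors; the function names are our own.

```python
import subprocess


def select_markdown(paths: list[str]) -> list[str]:
    """Keep only Markdown files from a list of paths."""
    return [path for path in paths if path.endswith(".md")]


def staged_files() -> list[str]:
    """List files staged for commit (added, copied, or modified)."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()


def lint_staged() -> int:
    """Run Vale on staged Markdown files and return its exit status."""
    files = select_markdown(staged_files())
    if not files:
        return 0  # nothing to lint, let the commit through
    return subprocess.run(["vale", *files]).returncode
```

Saved as an executable .git/hooks/pre-commit script that ends with raise SystemExit(lint_staged()), this would block a commit whenever Vale exits with a non-zero status. Linting only staged files keeps the check fast, which matters for the trade-off described above.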

Bringing the style guide to authors

We want to explain errors with more accuracy.

We’ll soon lift the bulk of our Vale patterns into an internal web style guide. We’ll make sure to supply links to specific parts of the style guide in every Vale rule using the link key. There’s only so much you can fit in an IDE contextual window; engineers must be able to access the full explanation, if they need to.

Create a Contentsquare Vale package

We are already using Vale in multiple projects and repositories (including this blog!). To enforce a consistent style across all our content, we’d like to build a single Contentsquare package. Then, one would only need to run vale sync to synchronize the common rules from any project, just like we did earlier with the Google package.

Conclusion

Two years ago, only a few engineers were confident enough to modify our public documentation. This resulted in a lot of friction and outdated content.

Since adopting a modern Docs as Code workflow with Vale as a linter, contributions have grown significantly. We have received enthusiastic feedback from engineers implementing Vale, including recent onboardees!

Our engineers are now more efficient writers and they feel empowered to contribute to the style guide, which creates a very positive feedback loop.

If, like us, you find Vale useful, remember that the project is open to donations!