Elijah Potter

Finding the Active Voice

Last week, Harper hit a stroke of luck: it was featured on MakeUseOf. The downstream social media posts collectively garnered nearly 300,000 views and drove a significant amount of traffic to our site. This all happened in the wake of LanguageTool shutting down the free edition of their software. Together, these two events have brought more attention than ever to Harper, which is amazing. It means our mission and values are resonating with people, which is always a good thing.

Reading the MUO article, we hear a lot of great things about Harper. We also hear that “it skips many of the premium bells and whistles of Grammarly”. The author goes on to explain that the privacy Harper provides more than makes up for any missing premium features, but the point is clear: those features are deeply desired.

So, the question becomes: Which of Grammarly’s features should we work on first, and how?

The Active Voice

I propose that we focus first on helping our users find the active voice. For context, the active voice is the style of writing where the subject of the verb in a clause is the doer of the action. This is in contrast to the passive voice, where the subject is the receiver of the action. For example, the sentence “The postal carrier was bitten by the dog.” is written in the passive voice, while the equivalent sentence “The dog bit the postal carrier.” is written in the active voice.

Text written in the active voice is commonly viewed as more authoritative, more confident, and easier to understand. Help with the active voice is one of the most commonly requested features in Harper, and including it would be a huge step towards competing directly with Grammarly Premium.

How should we go about helping our users with their active voice?

How It Could Be Done

I spoke briefly with Matt, and we agreed that a two-tier solution would be best. A fast algorithm or model would detect instances of the passive voice, and only those instances would be handed to a larger, more computationally expensive model that generates a rewrite in the active voice.
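To make the two-tier idea concrete, here is a minimal sketch in Python. It is not Harper's code: `suggest_active_voice`, `toy_detector`, and `toy_rewriter` are hypothetical names, and the two toy functions are stand-ins for a real detector and a real model. The point is only the control flow, where the cheap check runs on every sentence and the expensive step runs only on sentences that were flagged.

```python
from typing import Callable, List, Optional


def suggest_active_voice(
    sentence: str,
    is_passive: Callable[[str], bool],
    rewrite: Callable[[str], str],
) -> Optional[str]:
    """Return an active-voice suggestion, or None if the sentence is left alone."""
    # Tier 1: cheap check, run on every sentence.
    if not is_passive(sentence):
        return None
    # Tier 2: expensive rewrite, run only on flagged sentences.
    return rewrite(sentence)


# Toy stand-ins so the sketch runs (assumptions, not real components).
calls: List[str] = []  # records how often the "expensive" tier is invoked


def toy_detector(sentence: str) -> bool:
    # Deliberately naive: flags sentences containing " was " and " by ".
    return " was " in sentence and " by " in sentence


def toy_rewriter(sentence: str) -> str:
    # A real implementation would call a language model here.
    calls.append(sentence)
    return "The dog bit the postal carrier."
```

Calling `suggest_active_voice("The dog bit the postal carrier.", toy_detector, toy_rewriter)` returns `None` without touching the rewriter, which is the property that keeps the common case fast.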

Fortunately, there is already extensive literature on the detection of the passive voice. In particular, I found the PassivePy paper stimulating. In fact, we can implement their ideas quite easily using the Weir language already baked into Harper. I have done so in a private branch. It turned out to be ~20 lines of code. That is pretty good bang for the buck!
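For readers who have not seen rule-based passive detection before, the sketch below shows the general flavor in Python. It is neither the Weir implementation mentioned above nor PassivePy's actual rule set. It is a much cruder heuristic that looks for a form of "to be" followed by something shaped like a past participle. The names `BE_FORMS`, `IRREGULAR`, and `find_passive` are made up for illustration, and the irregular participle list is a tiny sample.

```python
import re
from typing import List

# Forms of "to be" that can introduce a passive construction.
BE_FORMS = r"(?:am|is|are|was|were|be|been|being)"

# A few irregular past participles; regular ones are matched by the "-ed" suffix.
IRREGULAR = r"(?:bitten|written|taken|given|made|done|seen|built|sent|found|known|shown)"

# "be" form, an optional "-ly" adverb, then a participle-shaped word.
PASSIVE = re.compile(
    rf"\b{BE_FORMS}\s+(?:\w+ly\s+)?(?:\w+ed|{IRREGULAR})\b",
    re.IGNORECASE,
)


def find_passive(text: str) -> List[str]:
    """Return each substring that looks like 'be + past participle'."""
    return [m.group(0) for m in PASSIVE.finditer(text)]
```

On the example from earlier, `find_passive("The postal carrier was bitten by the dog.")` returns `["was bitten"]`, while the active version returns an empty list. A pattern this simple produces false positives (it would flag "was tired" or "is red"), which is why real detectors lean on part-of-speech tags or dependency information rather than spelling alone.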

The second piece, which has to do with the actual simplification of text and conversion from the passive voice to the active voice, is a tad more complex.

Matt and I agree that it will require the use of a larger language model. The trouble is that it cannot be too large. Harper’s shtick is that it is fast, private, and that everything runs directly on our users’ devices. That means whichever model we use for our style transfer will need to be relatively small.

I believe the best solution to this problem is to take an off-the-shelf model, like one of Google’s T5 models, and fine-tune it for the specific types of style transfer we need. These are relatively small models (quantized, they can fit into spaces under 65 megabytes) and they run quite quickly, even on older hardware that doesn’t have access to matrix multiplication accelerators. There is prior art for running these models at 50 tokens per second in Chrome, without WebGPU, on a single core. The best part is that they’re under the Apache-2.0 license!
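A quick back-of-the-envelope check of those numbers, written as Python. The post does not say which T5 variant or which quantization scheme is meant, so this assumes T5-small at roughly 60 million parameters and 8-bit weights, and it ignores file-format overhead. The function names are hypothetical.

```python
def quantized_size_mb(num_params: int, bits_per_weight: int) -> float:
    """Approximate weight storage in megabytes (1 MB = 1,000,000 bytes)."""
    return num_params * bits_per_weight / 8 / 1_000_000


def seconds_to_generate(num_tokens: int, tokens_per_second: float) -> float:
    """Time to emit num_tokens at a steady decoding rate."""
    return num_tokens / tokens_per_second


# Assumption: approximate parameter count of T5-small.
T5_SMALL_PARAMS = 60_000_000

full_precision_mb = quantized_size_mb(T5_SMALL_PARAMS, 32)  # 32-bit floats
eight_bit_mb = quantized_size_mb(T5_SMALL_PARAMS, 8)        # 8-bit quantized
rewrite_seconds = seconds_to_generate(20, 50)               # a 20-token sentence
```

Under those assumptions the weights drop from 240 MB at full precision to 60 MB at 8 bits, which is consistent with the "under 65 megabytes" figure, and a 20-token rewrite at 50 tokens per second takes about 0.4 seconds.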

How This Fits in with the Weirpack Project

These models are small, but they’re not quite small enough to be a part of the standard distribution of Harper. I believe this should be an opt-in feature, and the best way to do that is to expose the functionality via a Weirpack. If you don’t know what a Weirpack is, I highly suggest you read my previous blog posts on the subject.

Everyone who wants this additional functionality could just enable it in the marketplace. This advances our goal of making Harper as customizable as our users want, while providing sensible defaults.

What’s Next?

Once we have the system in place to detect the passive voice and suggest active-voice rewrites, we will be prepared to do other kinds of transformation, such as adjusting formality or tone.

I’m really excited about this project, and I can’t wait to get started.

Published January 26, 2026 at 7:00 AM

Proofread by Harper.
