Detecting Inclusive Language in My Codebase With Sonatype Lift

This past Monday, Sonatype employees around the world took the day off work to reflect on the American holiday of Juneteenth. One of the discussions on our internal Slack centered on inclusive language and its impact in the workplace. We concluded that continually reexamining our unconscious biases is key. This isn't always easy, especially when terms like "master" and "slave" have been baked into computer networking and architecture models for decades.

In recent years, GitHub took steps to remove some of these terms from the platform, specifically renaming the default branch from "master" to "main". But surely there's more we can do to help eradicate the use of non-inclusive terms from the software industry.

This week I had a chance to play around with Lift's extensibility API to address just this issue. This feature allows me to add custom static analysis tools to run on my GitHub pull requests. I found an open-source tool written by an engineer at Datadog: Woke. Woke is a linting tool that searches code bases for language that is not inclusive.

The Lift documentation describes a small set of functions that must be defined to add a custom tool:

  • version: Returns the version of the tool.
  • applicable: Determines if the custom tool should be run on this particular code base. For example, some tools may apply only to specific programming languages.
  • run: Analyzes the code base and returns any findings in the form of the following JSON object:
{ "type" : <string>,
  "message" : <string>,
  "file" : <string>,
  "line" : <int>,
  "details_url" : <optional string>
}
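To make the three functions concrete, here is a minimal Python sketch of a custom tool. It is not my Woke wrapper: it scans for a tiny, illustrative term list on its own. It also assumes the tool is invoked with three arguments (repository directory, commit hash, command name); that calling convention, the version string, the term list, and the file suffixes are all assumptions for illustration, so check the Lift documentation for the exact contract.

```python
import json
import sys
from pathlib import Path

TOOL_VERSION = "1"                          # hypothetical version string
FLAGGED_TERMS = ("whitelist", "blacklist")  # illustrative terms only
SCANNED_SUFFIXES = {".md", ".txt", ".py"}   # illustrative file types


def _candidate_files(repo_dir):
    """Return the files this sketch knows how to scan, in a stable order."""
    root = Path(repo_dir)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix in SCANNED_SUFFIXES
    )


def applicable(repo_dir):
    """The tool applies only if there is at least one file it can scan."""
    return bool(_candidate_files(repo_dir))


def run(repo_dir):
    """Scan the repository and return findings shaped like the JSON above."""
    root = Path(repo_dir)
    findings = []
    for path in _candidate_files(repo_dir):
        text = path.read_text(encoding="utf-8", errors="ignore")
        for number, line in enumerate(text.splitlines(), start=1):
            for term in FLAGGED_TERMS:
                if term in line.lower():
                    findings.append({
                        "type": "inclusive-language",
                        "message": f"Consider replacing '{term}'.",
                        "file": path.relative_to(root).as_posix(),
                        "line": number,
                    })
    return findings


def main(args):
    """Dispatch on the command name (assumed order: dir, commit, command)."""
    repo_dir, _commit, command = args
    if command == "version":
        print(TOOL_VERSION)
    elif command == "applicable":
        print("true" if applicable(repo_dir) else "false")
    elif command == "run":
        print(json.dumps(run(repo_dir)))
    else:
        return 1  # unknown command
    return 0


# Only dispatch when called with the three assumed arguments.
if __name__ == "__main__" and len(sys.argv) == 4:
    sys.exit(main(sys.argv[1:]))
```

The optional details_url field is omitted here; a real tool could add it to point at documentation for each rule.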

Using these guidelines, I wrote a simple script that downloads the Woke project, runs it on the current repository, and converts the output to Lift-consumable JSON. When I ran Lift on a pull request against my simple Hello World demo, this was the result:

[Screenshot: the result on the pull request]

From the Lift dashboard:

[Screenshot: the result in the Lift dashboard]
Woke can also be customized around the terms it searches for.
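The conversion step mentioned earlier (linter output to Lift-consumable JSON) can be sketched as follows. This is not the script I used; it assumes, purely for illustration, that the linter prints one finding per line in the common `path:line:column: message` shape. Check your linter's actual output format before relying on it. The function name and the "inclusive-language" type label are hypothetical.

```python
import json
import re

# Assumed line shape: <path>:<line>:<column>: <message>
FINDING_LINE = re.compile(
    r"^(?P<file>[^:]+):(?P<line>\d+):(?P<col>\d+): (?P<message>.+)$"
)


def to_lift_findings(linter_output, finding_type="inclusive-language"):
    """Convert linter text output into a list of Lift-style finding dicts."""
    findings = []
    for raw in linter_output.splitlines():
        match = FINDING_LINE.match(raw.strip())
        if match is None:
            continue  # skip blank lines and anything that is not a finding
        findings.append({
            "type": finding_type,
            "message": match["message"],
            "file": match["file"],
            "line": int(match["line"]),
        })
    return findings


if __name__ == "__main__":
    # Hypothetical sample line, not captured from a real run.
    sample = "README.md:3:10: `whitelist` may be insensitive, use `allowlist` instead\n"
    print(json.dumps(to_lift_findings(sample), indent=2))
```

Lines that do not match the assumed shape are silently skipped, so source excerpts or summary lines in the linter output do not turn into bogus findings.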

To try it out on your own projects, simply reuse my .lift/woke file and add the following line to .lift/config.toml in your project's root directory:

customTools = [ ".lift/woke" ]

To make your own custom plugin, choose your favorite static analysis tool and follow the API above. More details and examples, including simple test cases, can be found here.

What other types of language would you like to see detected on pull requests? Profanity is the next one that comes to my mind; flagging it could help make open-source projects more welcoming for everyone.

Written by Theresa Mammarella

Theresa is a software engineer and developer advocate who enjoys helping developers harness the full potential of their tools to create innovative solutions. Theresa has a background as an open source contributor to Java Virtual Machine and compiler projects at IBM and Red Hat. She has now embarked on a new journey into the exciting realm of security and static analysis tooling, advocating for the needs of developers. When she's not coding, Theresa loves to spend her time volunteering with animal rescues and exploring the great outdoors, where she can often be found hiking, camping, or simply soaking up nature's beauty.