Search and Replace Quoted Strings in YAML
Regex capture groups and bash positional arguments to replace text in VS Code
I've been knee-deep in OpenAPI specs in my current role as an API Product Manager. While the work is fulfilling, there are some very mundane aspects to onboarding a development team onto the OpenAPI specification; in particular, the formatting of YAML documents. YAML is wonderfully readable when certain conventions are adhered to. Without getting into the firestorm of single quotes ('), double quotes ("), or no quotes (you can read more about that here), I find myself reviewing document after document where our developers diverge from each other in the way they document our APIs. Certainly, there is a selfish aspect to wanting consistency while reviewing these documents: I have thousands of specs to review for our internal developer ecosystem, and I'm trying my best to automate as much of the review work as possible.
OK, let's get to it: I have documents exceeding 10,000 lines, with hundreds of enumerations throughout. I needed a quick and easy way to clean up the excessive quotes wrapping enum strings because they look hideous!
Typically, I'm reviewing OpenAPI docs in VS Code. First and foremost, it's my favorite editor; secondly, it has a plethora of extensions that make reading 10,000-line documents much easier.
Some of my faves: Indent-Rainbow, Spectral API Linter, and Prettier.
To follow along, open find with ctrl + f / cmd + f and select the .* icon, or toggle regex with alt + r / option + cmd + r.
There is probably some caveat here about using example(s) in OpenAPI, because there are a few different ways to represent them depending on the version. I'm using example at the schema level, which means my documents don't contain any array-style examples that my regex could accidentally rewrite.
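To make that caveat concrete, here is a minimal TypeScript sketch (VS Code's in-editor find uses JavaScript regex, so the behavior should carry over). It runs the find pattern from later in this post against two made-up snippets: one with a schema-level scalar example, and one with an array-valued example. The snippets and the `touchedLines` helper are hypothetical, purely for illustration.

```typescript
// The find pattern, with the nested groups written as non-capturing (?:...).
const enumQuotes = /^(\s+-\s)('|"|`)(\w+(?:(?:\s|-)+\w+)*)('|"|`)$/;

// Hypothetical helper: returns the lines of a YAML snippet the pattern would rewrite.
function touchedLines(yaml: string): string[] {
  return yaml.split("\n").filter((line) => enumQuotes.test(line));
}

// Scalar example at the schema level (what I use).
const scalarExample = [
  "breed:",
  "  type: string",
  "  enum:",
  "    - 'BEAGLE'",
  "  example: 'BEAGLE'",
].join("\n");

// Array-valued example (what I don't have in my documents).
const arrayExample = [
  "breeds:",
  "  type: array",
  "  example:",
  "    - 'BEAGLE'",
  "    - 'RETRIEVER'",
].join("\n");

console.log(touchedLines(scalarExample)); // only the enum item
console.log(touchedLines(arrayExample)); // both example items would lose their quotes
```

If your specs do carry quoted array examples that need to stay quoted, review each match instead of hitting replace all.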
We start with an OpenAPI document similar to our Animal example:
openapi: 3.0.3
info:
  description: This is a great representation of an Animal
  version: '1.0.0'
  title: Animal
  contact:
    name: 'jeremy'
    email: 'firstname.lastname@example.org'
paths:
  /endpoint:
    get:
      ...
components:
  schemas:
    Animal:
      description: Some type of living creature
      type: object
      required:
        - breed
      properties:
        breed:
          type: string
          enum:
            - 'GERMAN SHEPARD'
            - 'BEAGLE'
            - 'RETRIEVER'
          example: 'BEAGLE'
I'm looking for any representation of an enumeration in the file.
The formatting is:
- any line starting with spaces or tabs
- eventually a dash (-) followed by a single space to indicate the array of enumerations available
- then, a string wrapped in some form of quotes
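The pattern to paste into the find field:

^(\s+-\s)('|"|`)(\w+(?:(?:\s|-)+\w+)*)('|"|`)$

Spaced out to show the capture groups (the spaces are only there for the diagram):

^(\s+-\s) ('|"|`) (\w+(?:(?:\s|-)+\w+)*) ('|"|`)$
^---1---^ ^--2--^ ^---------3----------^ ^--4--^

Capture Groups:
one: starts with one or more spaces or tabs, a dash and exactly one following whitespace character
two: finds any type of quote: single ('), double ("), backtick (`)
three: finds any word representation, including spaces, underscores, or hyphens. The inner groups are non-capturing, (?:...), so they don't shift the numbering
four: finds any type of quote: single ('), double ("), backtick (`)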
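Before running a pattern like this over a 10,000-line document, I like to sanity check it on a few lines. Here is a small TypeScript sketch that does that (TypeScript because JavaScript regex is what VS Code's in-editor find uses). The sample lines are invented for illustration, and the nested groups are written as non-capturing so the numbering stays at one through four.

```typescript
const enumQuotes = /^(\s+-\s)('|"|`)(\w+(?:(?:\s|-)+\w+)*)('|"|`)$/;

// Each sample pairs a line with whether I expect the pattern to match it.
const samples: Array<[string, boolean]> = [
  ["      - 'GERMAN SHEPARD'", true], // single quotes, two words
  ['      - "BEAGLE"', true], // double quotes
  ["      - `GOLDEN-RETRIEVER`", true], // backticks, hyphenated
  ["      - BEAGLE", false], // already unquoted
  ["      example: 'BEAGLE'", false], // no leading dash
  ["      - 'N/A'", false], // "/" is not covered by \w, \s, or -
  ["      - 'BEAGLE\"", true], // mismatched quotes still match
];

const results = samples.map(([line, expected]) => ({
  line,
  expected,
  matched: enumQuotes.test(line),
}));

for (const r of results) {
  console.log((r.matched ? "match     " : "no match  ") + r.line);
}
```

Note the last sample: groups two and four each accept any quote character on their own, so a mismatched pair gets matched as well. That hasn't been a problem in my documents, but it's worth knowing before you replace all.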
VS Code's in-editor find runs on the JavaScript regex engine (search across files uses ripgrep), and both let the replace field refer back to capture groups by number, much like positional arguments in a bash script. This is why I've split the regex into capture groups: each capture group is available as a separate argument, and we can use these arguments to replace our enum strings without the ugly quotes.
The arguments follow a numbering sequence and can be called directly in the replace text field.
If we were to look at the argument sequence, it breaks down like this:
- $0 is the entire match of the find pattern. In our case, that's the whole line the regex matched: ......- 'GERMAN SHEPARD'
- $1 is the indentation, dash and single space before the string
- $2 is the first set of quotes
- $3 is our enum string
- $4 is the final set of quotes
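If you'd like to see that sequence outside of VS Code, this short TypeScript sketch prints each argument for one sample line. It assumes the nested groups inside group three are non-capturing, (?:...), so that the final quote lands in $4; the six-space indentation of the sample line is made up.

```typescript
const enumQuotes = /^(\s+-\s)('|"|`)(\w+(?:(?:\s|-)+\w+)*)('|"|`)$/;

const line = "      - 'GERMAN SHEPARD'";
const m = line.match(enumQuotes);
if (m === null) {
  throw new Error("expected the sample line to match");
}

// Index 0 is the whole match; indexes 1 to 4 are the capture groups.
const args = m.slice(0, 5);
args.forEach((value, i) => {
  console.log("$" + i + " = " + JSON.stringify(value));
});
// $0 = "      - 'GERMAN SHEPARD'"
// $1 = "      - "
// $2 = "'"
// $3 = "GERMAN SHEPARD"
// $4 = "'"
```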
Now we can use these arguments to replace the text without the quotes.
$1$3

$1 holds the first capture group, which is " - ". This is the indentation and dash.
$3 holds the third capture group, which is our enum string *without* quotes.
properties:
  breed:
    type: string
    enum:
      - GERMAN SHEPARD
      - BEAGLE
      - RETRIEVER
    example: 'BEAGLE'
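The same find and replace can be sketched in a few lines of TypeScript, since JavaScript's String.replace understands the same $1 and $3 references. This is only an illustration of the technique, not how VS Code does it internally; the `before` snippet is a trimmed copy of the Animal schema, and the g and m flags stand in for "replace all" and line-by-line anchoring.

```typescript
// g: replace every match, m: let ^ and $ anchor to each line.
const enumQuotes = /^(\s+-\s)('|"|`)(\w+(?:(?:\s|-)+\w+)*)('|"|`)$/gm;

const before = [
  "        breed:",
  "          type: string",
  "          enum:",
  "            - 'GERMAN SHEPARD'",
  "            - 'BEAGLE'",
  "            - 'RETRIEVER'",
  "          example: 'BEAGLE'",
].join("\n");

// Keep the indentation and dash ($1) and the bare enum string ($3); drop the quotes.
const after = before.replace(enumQuotes, "$1$3");

console.log(after);
```

The example line keeps its quotes because it has no leading dash, which matches what the replace does in the editor.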
It's so beautiful!! Now I can get back to reviewing my documents without stabbing my eyeballs with a bunch of ugly quotes.
If you like tips like these, let me know!