
Generating Text Data from Regular Expressions


Shou Arisaka
Nov 1, 2025

This article is about generating text data from regular expressions.

You might want to do this when you need sample or fake data, or simply because writing out every candidate string by hand is cumbersome and a regular expression is a cleaner way to note them down.

The following Stack Overflow question asks exactly that:

java - Using Regex to generate Strings rather than match them - Stack Overflow

After a quick search I found two libraries: Generex, mentioned in the question above, and one more, a Ruby library.

mifmif/Generex: A Java library for generating String from a regular expression.
tom-lord/regexp-examples: Generate strings that match a given regular expression

The Ruby one is newer and, being Ruby, more approachable. The Java one is quite old, but functionally they don’t seem very different. Still, since Ruby has a reputation as a language that is strong at regular expressions, I had higher expectations for the Ruby library.

So I tried using it.

Installation is as follows:

# Append the gem to the Gemfile, then install it
echo "gem 'regexp-examples'" >> Gemfile
bundle install

By the way, I recently made a tool for bulk Google searches. I search in English, but since I’m not a native speaker, there is often a more appropriate word than the one I first think of. For example, sometimes “create” fits better than “make,” or vice versa. In such cases, you can write the pattern /how to (make|create) something/, generate the text data with regexp-examples, and run a Google search for each line:

how to make something
how to create something
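
To make concrete what that expansion produces, here is a minimal sketch in plain Ruby that does not use regexp-examples at all. It assumes the pattern has already been split by hand into slots of alternatives, and the method name expand_slots is hypothetical; it is not how the library itself works internally.

```ruby
# Hypothetical helper, not part of regexp-examples: takes an array of slots,
# where each slot is an array of alternative strings, and returns every
# combination joined into one string.
def expand_slots(slots)
  first, *rest = slots
  # Array#product builds the Cartesian product of the slots.
  first.product(*rest).map(&:join)
end

# "how to (make|create) something" split by hand into three slots.
puts expand_slots([["how to "], ["make", "create"], [" something"]])
```

Each extra alternative multiplies the number of queries, which is why writing the pattern once is less work than listing every line.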

I’ve gotten so carried away with my enthusiasm for programming that I even thought, “I want to write a novel in a programming language.” To Google that idea efficiently, I tried the following regular expression with regexp-examples:

require "regexp-examples"

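# Quantifiers are expanded to only a few repetitions:
# puts /a*/.examples #=> ['', 'a', 'aa']

# Print the generated strings, one per line. The parentheses avoid Ruby's
# "ambiguous first argument" warning for a regex literal after puts.
puts(/(writing|construct|make) (novel )?story (by|with) (programming|predicate logic)/.examples)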


The result is as follows. Wonderful.

writing story by programming
writing story by predicate logic
writing story with programming
writing story with predicate logic
writing novel story by programming
writing novel story by predicate logic
writing novel story with programming
writing novel story with predicate logic
construct story by programming
construct story by predicate logic
construct story with programming
construct story with predicate logic
construct novel story by programming
construct novel story by predicate logic
construct novel story with programming
construct novel story with predicate logic
make story by programming
make story by predicate logic
...
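
The listing above is cut off, so as a sanity check here is a small plain-Ruby sketch that does not call regexp-examples. It counts how many strings the pattern can produce (3 verbs × 2 for the optional "novel " × 2 prepositions × 2 topics) and confirms that a few of the lines shown match the pattern when it is anchored. The constant and method names are made up for illustration, and the count assumes the library does not truncate a result set this small.

```ruby
# The pattern from the article, anchored with \A and \z so that only
# whole-string matches count.
PATTERN = /\A(writing|construct|make) (novel )?story (by|with) (programming|predicate logic)\z/

# Number of alternatives at each choice point:
# verb, optional "novel ", preposition, topic.
CHOICES = [3, 2, 2, 2]

# Hypothetical helper: multiplies the choice counts together.
def total_combinations(choices)
  choices.reduce(1, :*)
end

# Hypothetical helper: true if every line fully matches the pattern.
def all_match?(lines, pattern)
  lines.all? { |line| pattern.match?(line) }
end

sample = [
  "writing story by programming",
  "construct novel story with predicate logic",
  "make story by predicate logic"
]

puts total_combinations(CHOICES)
puts all_match?(sample, PATTERN)
```

This prints 24 and true: a full expansion of the pattern would have 24 lines, and the sampled lines are all valid matches.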

As a side note, it seems that writing novels in a programming language is still too early for humanity. That’s unfortunate.
