This Rasa Importer Helps You Focus on Writing Training Data

I have been writing a lot of training data for my Rasa chatbot, and tweaking the pipeline is a tedious task: whenever something new (a phrase or message) is introduced to the chatbot during an actual conversation, it fails to identify the correct intent and entities.

As a lazy programmer, I do not want to write repetitive training data. For example:

## intent:get_name
- My name is Nikola Tesla
- My name is Thomas Edison
- I am Thomas Edison
- She is Marie Curie

For a programmer, thinking of a name is also a hard task (e.g. naming your project). The above example is just a simple one. Developing a chatbot requires you to write more and more training data, more intents, more entities... and more repeated data.

I needed to focus on writing the intents and spend less time thinking up incidental data like names, addresses, ages, etc. To solve this, I came up with the idea of using a "placeholder" and randomly generating data for it.

My solution was to write PlaceholderImporter, a custom importer for Rasa that replaces placeholders with fake data. The first step is to write your training data like this:

## intent:get_name
- My name is {name}
- I am @name

PlaceholderImporter accepts two placeholder styles: curly braces ({}) and the @ symbol. Curly braces are common in Python string formatting, while @ is used in other languages. You can mix them, but my advice is to stick to one style when writing placeholders.
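To make the two styles concrete, here is a minimal sketch of how both could be matched and filled in with a single regular expression. This is only an illustration of the idea, not rasam's actual code: the `PLACEHOLDER` pattern, the `FAKE_DATA` pool, and the `fill_placeholders` function are all hypothetical names, and the real importer may generate its values quite differently.

```python
import random
import re

# Matches either "{name}" (group 1) or "@name" (group 2).
PLACEHOLDER = re.compile(r"\{(\w+)\}|@(\w+)")

# Hypothetical fake-data pool, hard-coded for illustration only.
FAKE_DATA = {
    "name": ["Nikola Tesla", "Thomas Edison", "Marie Curie"],
}


def fill_placeholders(text, rng=random):
    """Replace every known placeholder in text with a random fake value."""

    def substitute(match):
        key = match.group(1) or match.group(2)
        values = FAKE_DATA.get(key)
        # Unknown placeholders are left exactly as written.
        return rng.choice(values) if values else match.group(0)

    return PLACEHOLDER.sub(substitute, text)


print(fill_placeholders("My name is {name}"))
print(fill_placeholders("I am @name"))
```

Both lines print a sentence with a randomly picked name, so the two styles are interchangeable in this sketch. Note that a bare `@` pattern like this one would also match things such as e-mail addresses, which is one more reason to stick to a single style.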

PlaceholderImporter is included in the rasam package. To install rasam, use pip:

pip install rasam

To use PlaceholderImporter, add the following to your Rasa config.yml:

importers:
  - name: rasam.PlaceholderImporter
    fake_data_count: 10  # default value is 1

If not specified, fake_data_count defaults to 1. This setting lets PlaceholderImporter generate more unique fake data from a single training example, so fake_data_count acts as a "multiplier".
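The "multiplier" effect can be sketched in a few lines of Python. Again, this is an assumption about the behavior rather than rasam's implementation: `expand`, `NAMES`, and the `seed` parameter are hypothetical, and I am assuming that examples without a placeholder are kept once instead of being multiplied.

```python
import random
import re

# Matches either "{name}" or "@name".
PLACEHOLDER = re.compile(r"\{(\w+)\}|@(\w+)")

# Hypothetical pool of fake values; every placeholder draws from it
# regardless of its key, to keep the sketch short.
NAMES = ["Nikola Tesla", "Thomas Edison", "Marie Curie", "Ada Lovelace"]


def expand(examples, fake_data_count=1, seed=None):
    """Turn each placeholder example into fake_data_count filled-in examples."""
    rng = random.Random(seed)
    expanded = []
    for example in examples:
        if not PLACEHOLDER.search(example):
            # Assumption: plain examples are not multiplied.
            expanded.append(example)
            continue
        for _ in range(fake_data_count):
            expanded.append(PLACEHOLDER.sub(lambda m: rng.choice(NAMES), example))
    return expanded


examples = ["My name is {name}", "I am @name"]
print(len(expand(examples, fake_data_count=10)))  # 2 examples x 10 = 20
```

With a small pool like this one, some of the generated examples will be duplicates; a real fake-data generator has far more values to draw from.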

Conclusion

I have been using PlaceholderImporter for a while now, and it really helps me focus on what I need most for my Rasa chatbots: writing shorter training data.

What does not work?

For some reason, rasa test does not use the custom importer specified.
