kronosparser: Python package for parsing time from text
10 Mar 2020

This post contains a few basic examples of how to use the kronosparser package. I don’t particularly like the name, but it’s descriptive enough to tell you what the package does. This post is an extended version of the readme available on GitHub.

Why build this package?

I built this package together with @roalonso back in the Sudo Technologies, Inc. days, and we later decided to open source it. The primary use case was parsing the text that users typed into the Sudo Chatbot. We needed to extract some metadata from that text (e.g., dates), which was then used to decide which actions the chatbot should take next. Now you can use this package for any use case you’d like: parse any kind of time expression from text and translate it into a date, a datetime, or a time interval. The expression can refer to either the future or the past, and you can set the timezone if you know it.

This package is based on a Parsing Expression Grammar defined using pyparsing. If you have any comments, feel free to open issues or send pull requests in the GitHub repository.

Installing the package

Until I finish publishing it as a PyPI package, you can install it directly from the GitHub repository, as shown below:

pip install git+https://github.com/Zapship/kronosparser.git

Usage

To use it, you need to import the parse_dates function. Besides the input text to parse, there are three parameters you can set:

  1. future: states whether the text you are parsing is supposed to refer to the future or the past (particularly helpful for chatbots that ask users questions framed in the past or the future).
  2. interval_to_date: useful when you have an interval, but you want to just pick a date (usually the closest date).
  3. timezone: the timezone you wish to use for your specific use case.

A straightforward example of how this package can be used is the following:

[{'end': 21,
  'parsed': {'datetime': '2020-03-11 14:20:38-07:00'},
  'start': 18,
  'text': 'now'},
 {'end': 39,
  'parsed': {'datetime': '2020-03-12 12:00:00'},
  'start': 26,
  'text': 'tomorrow noon'},
 {'end': 53,
  'parsed': {'interval': {'end': '2020-03-22', 'start': '2020-03-16'}},
  'start': 44,
  'text': 'next week'}]

Note that the example above has 3 types of identified instances: a datetime with timezone, a datetime without timezone, and a time interval.
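To make the interval case more concrete, here is a minimal sketch, using only the standard library, of how "next week" can map to a Monday-to-Sunday interval and how such an interval could be collapsed into a single date. It is not kronosparser’s implementation: the function names next_week_interval and interval_to_closest_date are hypothetical, and "closest date" is assumed to mean the date inside the interval nearest to the reference date.

```python
from datetime import date, timedelta


def next_week_interval(reference: date) -> dict:
    """Monday-to-Sunday interval of the week after `reference` (hypothetical helper)."""
    # Go back to the Monday of the current week, then jump one week ahead.
    start = reference - timedelta(days=reference.weekday()) + timedelta(weeks=1)
    end = start + timedelta(days=6)
    return {"start": start.isoformat(), "end": end.isoformat()}


def interval_to_closest_date(interval: dict, reference: date) -> str:
    """Collapse an interval to the date inside it nearest to `reference` (assumed semantics)."""
    start = date.fromisoformat(interval["start"])
    end = date.fromisoformat(interval["end"])
    # Clamping the reference into [start, end] yields the nearest date in the interval.
    return min(max(reference, start), end).isoformat()


reference = date(2020, 3, 11)  # the day on which the example above was run
interval = next_week_interval(reference)
print(interval)                                       # {'start': '2020-03-16', 'end': '2020-03-22'}
print(interval_to_closest_date(interval, reference))  # 2020-03-16
```

With a reference date before the interval, the closest date is simply the interval’s start, which is why collapsing "next week" here gives the following Monday.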

Another simple example is the word tomorrow on its own, which is parsed into a date without a timezone.

[{'end': 29, 'parsed': {'date': '2020-03-12'}, 'start': 21, 'text': 'tomorrow'}]
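In the results shown here, end minus start always equals the length of the matched text, so the two offsets can be used directly as a Python slice into the input. The sketch below relies on that observation to mark the matched spans. The helper highlight is hypothetical, and so is the input sentence: the original input is not shown in this post, and this one was chosen only so that the offsets line up with the result above.

```python
def highlight(text, results):
    """Wrap every matched span in square brackets (hypothetical helper)."""
    out = text
    # Work right to left so earlier offsets stay valid while brackets are inserted.
    for r in sorted(results, key=lambda r: r["start"], reverse=True):
        if text[r["start"]:r["end"]] != r["text"]:
            raise ValueError("offsets do not match the text: %r" % (r,))
        out = out[:r["start"]] + "[" + r["text"] + "]" + out[r["end"]:]
    return out


text = "I will call you back tomorrow"  # hypothetical input sentence
results = [{'end': 29, 'parsed': {'date': '2020-03-12'}, 'start': 21, 'text': 'tomorrow'}]
print(highlight(text, results))  # I will call you back [tomorrow]
```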

The following example shows how the future flag works. If you parse an input text containing a day (e.g., friday) with the future option set to True, it will find the closest instance of that day in the future.

[{'end': 6, 'parsed': {'date': '2020-03-13'}, 'start': 0, 'text': 'friday'}]

However, if you set the future flag to False, it will find the closest instance of that day in the past.

[{'end': 6, 'parsed': {'date': '2020-03-06'}, 'start': 0, 'text': 'friday'}]
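Both results are consistent with a reference date of Wednesday, 2020-03-11. The date arithmetic behind this kind of lookup can be sketched with the standard library alone. This is an illustration of the behavior, not the package’s code: closest_weekday is a hypothetical helper, and it assumes the reference day itself is never returned (the post does not say what happens when today is already a Friday).

```python
from datetime import date, timedelta

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]


def closest_weekday(name: str, reference: date, future: bool) -> date:
    """Nearest date on weekday `name`, strictly after or strictly before `reference`."""
    target = WEEKDAYS.index(name.lower())
    if future:
        # Number of days to move forward, in the range 1..7.
        delta = (target - reference.weekday() - 1) % 7 + 1
        return reference + timedelta(days=delta)
    # Number of days to move backward, in the range 1..7.
    delta = (reference.weekday() - target - 1) % 7 + 1
    return reference - timedelta(days=delta)


reference = date(2020, 3, 11)  # a Wednesday
print(closest_weekday("friday", reference, future=True))   # 2020-03-13
print(closest_weekday("friday", reference, future=False))  # 2020-03-06
```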

It is important to note that if you don’t define the timezone, US/Pacific is used by default. Also, if no parameters are specified, the future flag defaults to False and the interval_to_date flag defaults to True. Many more examples are available in the package’s test cases.

Let me know if this package is useful and/or you’d like to see it as an official PyPI package. I plan to publish it, but it’s buried in my backlog, and knowing there is interest will help me prioritize that item in my never-ending TODO list.