opensextant python lib v1.6.7 - improve date and coord extractors

@mubaldino mubaldino released this 30 Jul 22:31
· 13 commits to main since this release

opensextant python in this release

Changes:

  • opensextant: Bug: get_language("eng") returned Language("en_AU"). Fixed. get_language() and load_language() now record Language metadata more accurately by ISO language code. Previously, some languages were keyed by a locale as the default. Correct behavior: get_language("eng") ==> Language("en", "eng"), and a lookup by locale ID returns that specific item.
  • opensextant.extractors.xtemporal: Feature: Differences between North American and European date conventions left certain dates undetected. For example, 07.04.2021 is ambiguous, but 21.04.2021 is not; it is most likely April 21, 2021. More date patterns are now implemented in Xtemporal.
  • opensextant.extractors.xcoord: Bug: The MGRS pattern would trivially match text containing UTC, GMT, and other common units of measure. Fixed. Time patterns that include UTC, GMT, or other common phrases are now avoided.
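
The date ambiguity described in the xtemporal item can be illustrated with a minimal standalone sketch. This is not the Xtemporal implementation and uses none of its API; the pattern `DOTTED_DATE` and the function `interpret_dotted_date` are hypothetical names chosen for illustration. It assumes a dotted date with two one- or two-digit fields and a four-digit year, tries both day-first and month-first readings, and reports whether exactly one reading is a valid calendar date.

```python
import re
from datetime import date

# Hypothetical pattern (not from opensextant): NN.NN.YYYY
DOTTED_DATE = re.compile(r"\b(\d{1,2})\.(\d{1,2})\.(\d{4})\b")


def interpret_dotted_date(text):
    """Return (date_or_None, status) for the first dotted date in text.

    status is one of "unambiguous", "ambiguous", "invalid", "no-match".
    """
    m = DOTTED_DATE.search(text)
    if not m:
        return None, "no-match"
    a, b, year = (int(g) for g in m.groups())

    # Try the day-first reading (a=day, b=month), then month-first (a=month, b=day).
    candidates = set()
    for day, month in ((a, b), (b, a)):
        try:
            candidates.add(date(year, month, day))
        except ValueError:
            pass  # not a valid calendar date under this reading

    if len(candidates) == 1:
        return candidates.pop(), "unambiguous"
    if candidates:
        return None, "ambiguous"
    return None, "invalid"


print(interpret_dotted_date("Meeting on 21.04.2021"))
print(interpret_dotted_date("Meeting on 07.04.2021"))
```

Under these assumptions, 21.04.2021 resolves to a single date because 21 cannot be a month, while 07.04.2021 stays ambiguous because both readings are valid.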

See the full OpenSextant Xponents server releases at https://github.com/OpenSextant/Xponents/releases, which point you to Docker to download the full server image.

Usage: pip3 install --user opensextant-1.6.7.tar.gz

Major Calls:

Extract geographic and other named entities using Xponents REST:

    import opensextant.xlayer

    client = opensextant.xlayer.XlayerClient( url )
    tags = client.process( ... text ... )

Extract just coordinates from text using XCoord:

    import opensextant.extractors.xcoord

    extractor = opensextant.extractors.xcoord.XCoord()
    coordinates = extractor.extract( text )

EXPERIMENTAL - Extract and summarize geographic entities, primarily when postal addresses, nationalities, etc. are of interest and present in the data, using Geotagger as below. This is a higher-level wrapper around the Xponents API client.

    import opensextant.xlayer

    extractor = opensextant.xlayer.Geotagger( args )
    tags = extractor.summarize( ... text... )