KeyError: 'accuracy' when processing Location History #7

@davidwilemski

I'm new to both the dogsheep tools and datasette but have been experimenting a bit the last few days and these are really cool tools!

I hit a problem importing my Google location history with the latest release of this tool, running in a Docker container:

Traceback (most recent call last):
  File "/usr/local/bin/google-takeout-to-sqlite", line 8, in <module>
    sys.exit(cli())
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/google_takeout_to_sqlite/cli.py", line 49, in my_activity
    utils.save_location_history(db, zf)
  File "/usr/local/lib/python3.9/site-packages/google_takeout_to_sqlite/utils.py", line 27, in save_location_history
    db["location_history"].upsert_all(
  File "/usr/local/lib/python3.9/site-packages/sqlite_utils/db.py", line 1105, in upsert_all
    return self.insert_all(
  File "/usr/local/lib/python3.9/site-packages/sqlite_utils/db.py", line 990, in insert_all
    chunk = list(chunk)
  File "/usr/local/lib/python3.9/site-packages/google_takeout_to_sqlite/utils.py", line 33, in <genexpr>
    "accuracy": row["accuracy"],
KeyError: 'accuracy'

It looks like the tool assumes the accuracy key will be in every location history entry.

My first attempt at a local patch was to read the accuracy key with .get instead of indexing, hoping that would make the column nullable, though I wasn't sure how sqlite_utils would handle the missing values. That worked, in that the import completed, so I was going to propose a patch with that change. But when I updated the existing test to include an entry with no accuracy key, I noticed that the expected type of the field appeared to change to a string in the test (and, from a quick scan of the sqlite_utils code, probably to TEXT in the database). Given that change in column type, opening an issue before proposing a fix seemed warranted. It seems the schema would need to be specified explicitly to get a nullable integer column.
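As a rough sketch of the nullable-integer idea, the snippet below reads accuracy with .get and declares the column type up front so it does not depend on type detection. It uses the standard library's sqlite3 module rather than sqlite_utils to stay self-contained. The function names and table layout are hypothetical, and the latitudeE7, longitudeE7 and timestampMs field names are assumptions about the Takeout JSON rather than something shown in the traceback. With sqlite_utils itself, the equivalent would presumably be an explicit column-type override on the insert call, which is worth checking against its documentation.

```python
import sqlite3


def location_rows(locations):
    """Yield one dict per entry; a missing "accuracy" key becomes None."""
    for row in locations:
        yield {
            "timestampMs": row["timestampMs"],
            "latitude": row["latitudeE7"] / 1e7,
            "longitude": row["longitudeE7"] / 1e7,
            # .get returns None instead of raising KeyError
            "accuracy": row.get("accuracy"),
        }


def save_locations(conn, locations):
    """Create the table with an explicit schema, then insert or replace rows."""
    # accuracy is declared INTEGER with no NOT NULL constraint, so NULL is allowed
    conn.execute(
        "CREATE TABLE IF NOT EXISTS location_history ("
        "timestampMs TEXT PRIMARY KEY, "
        "latitude REAL, "
        "longitude REAL, "
        "accuracy INTEGER)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO location_history "
        "VALUES (:timestampMs, :latitude, :longitude, :accuracy)",
        location_rows(locations),
    )
    conn.commit()
```

Because the column type is fixed before any rows arrive, entries without an accuracy value are stored as NULL while the others keep their integer type.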

Now that I've completed an import using my initial fix of calling .get on the row dict, I can see in datasette that only 7 data points (out of ~250k) have a null accuracy value. They are all from 2011-2012, in an import that spans roughly 2010-2016. If missing values really are that infrequent, another approach might be to filter those entries out during import.
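The filtering alternative could look something like the sketch below. The helper name is hypothetical, and it simply drops any entry without an accuracy key while counting how many were skipped, so the import can report what it left out.

```python
def split_by_accuracy(locations):
    """Return (entries that have an "accuracy" key, number of entries skipped)."""
    kept = []
    skipped = 0
    for row in locations:
        if "accuracy" in row:
            kept.append(row)
        else:
            skipped += 1
    return kept, skipped
```

The trade-off is that the filtered points disappear from the database entirely, whereas the nullable-column approach keeps their timestamps and coordinates.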

I'm happy to provide a PR for a fix but figured I'd ask about which direction is preferred first.
