@@ -111,7 +111,7 @@ the drop down menu
     a. Pass this key as a parameter or within a dictionary
 
     b. Create a JSON or YAML file. The default path is
-    **.hdx\_configuration.yaml** in the current user's home
+    **.hdx_configuration.yaml** in the current user's home
     directory. Then put in the YAML file:
 
         hdx_key: "HDX API KEY"
@@ -203,16 +203,17 @@ virtualenv if not installed:
 9. Use configuration defaults.
 
     If you only want to read data, then connect to the production HDX
-    server, replacing A_Quick_Example with something short that describes your project:
+    server, making sure that you replace MyOrg_MyProject with something that
+    describes your organisation and project:
 
-        Configuration.create(hdx_site="prod", user_agent="A_Quick_Example", hdx_read_only=True)
+        Configuration.create(hdx_site="prod", user_agent="MyOrg_MyProject", hdx_read_only=True)
 
     If you want to write data, then for experimentation, do not use the
     production HDX server. Instead you can use one of the test servers.
     Assuming you have an API key stored in a file **.hdxkey** in the
     current user's home directory:
 
-        Configuration.create(hdx_site="stage", user_agent="A_Quick_Example")
+        Configuration.create(hdx_site="stage", user_agent="MyOrg_MyProject")
 
 10. Read this dataset
     [Novel Coronavirus (COVID-19) Cases Data](https://data.humdata.org/dataset/novel-coronavirus-2019-ncov-cases)
@@ -247,7 +248,22 @@ virtualenv if not installed:
         dataset.set_reference_period("PREVIOUS DATE")
         dataset.update_in_hdx()
 
-15. Exit and remove virtualenv:
+15. If you are storing your data on HDX, you can upload a new file to a
+    resource:
+
+        resource = dataset.get_resource(0)
+        resource.set_file_to_upload("PATH TO FILE")
+        resource.update_in_hdx()
+
+16. Alternatively, if you are using a URL to point to data held externally from
+    HDX, you can mark that the data has been updated before updating the
+    resource or parent dataset:
+
+        resource = dataset.get_resource(2)
+        resource.mark_data_updated()
+        dataset.update_in_hdx()
+
+17. Exit and remove virtualenv:
 
     exit()
     deactivate
@@ -270,7 +286,7 @@ facades set up both logging and HDX configuration.
 The default configuration loads an internal HDX configuration located within the
 library, and assumes that there is an API key file called **.hdxkey** in the current
 user's home directory **\~** and a YAML project configuration located relative to your
-working directory at **config/project\_configuration.yaml** which you must create. The
+working directory at **config/project_configuration.yaml** which you must create. The
 project configuration is used for any configuration specific to your project.
 
 The default logging configuration reads a configuration file internal to the library
@@ -335,34 +351,35 @@ appropriate keyword arguments ie.
 
 You must supply a user agent using one of the following approaches:
 
-1. Populate parameter **user\_agent** (which can simply be the name of your project)
-2. Supply **user\_agent\_config\_yaml** which should point to a YAML file which
-   contains a parameter **user\_agent**
-3. Supply **user\_agent\_config\_yaml** which should point to a YAML file and populate
-   **user\_agent\_lookup** which is a key to look up in the YAML file which should be of
-   form:
+1. Populate parameter **user_agent** (which should be the name of your
+   organisation and project)
+2. Supply **user_agent_config_yaml**, which should point to a YAML file that
+   contains a parameter **user_agent**
+3. Supply **user_agent_config_yaml**, which should point to a YAML file, and populate
+   **user_agent_lookup**, a key to look up in the YAML file, which should
+   be of the form:
 
        myproject:
          user_agent: test
       myproject2:
         user_agent: test2
 
-4. Include **user\_agent** in one of the configuration dictionaries or files outlined in
+4. Include **user_agent** in one of the configuration dictionaries or files outlined in
 the table below eg.
-**hdx\_config\_json** or **project\_config\_dict**.
+**hdx_config_json** or **project_config_dict**.
 
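As an illustrative sketch of approach 3, not part of the original text: the YAML file path and lookup key below are hypothetical, and the import path may differ between library versions.

```python
# Hedged sketch of approach 3: point Configuration.create at a YAML file of
# user agents and select one entry via user_agent_lookup.
# "PATH/TO/user_agents.yaml" and "myproject" are illustrative assumptions.
from hdx.api.configuration import Configuration

Configuration.create(
    hdx_site="stage",
    user_agent_config_yaml="PATH/TO/user_agents.yaml",  # file of the form shown above
    user_agent_lookup="myproject",  # picks user_agent: test from that file
)
```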
 **KEYWORD ARGUMENTS** can be:
 
 | Choose| Argument| Type| Value| Default|
 | ---| ---| ---| ---| ---|
-| | hdx\_site| Optional\[str\]| HDX site to use eg. prod, feature| test|
-| | hdx\_read\_only| bool| Read only or read/write access to HDX| False|
-| | hdx\_key| Optional\[str\]| HDX key (not needed for read only)||
-| Above or one of:| hdx\_config\_dict| dict| Dictionary with hdx\_site, hdx\_read\_only, hdx\_key||
-| or| hdx\_config\_json| str| Path to JSON configuration with values as above||
-| or| hdx\_config\_yaml| str| Path to YAML configuration with values as above||
-| Zero or one of:| project\_config\_dict| dict| Project specific configuration dictionary||
-| or| project\_config\_json| str| Path to JSON Project||
+| | hdx_site| Optional\[str\]| HDX site to use eg. prod, feature| test|
+| | hdx_read_only| bool| Read only or read/write access to HDX| False|
+| | hdx_key| Optional\[str\]| HDX key (not needed for read only)||
+| Above or one of:| hdx_config_dict| dict| Dictionary with hdx_site, hdx_read_only, hdx_key||
+| or| hdx_config_json| str| Path to JSON configuration with values as above||
+| or| hdx_config_yaml| str| Path to YAML configuration with values as above||
+| Zero or one of:| project_config_dict| dict| Project specific configuration dictionary||
+| or| project_config_json| str| Path to JSON project configuration||
 
 To access the configuration, you use the **read** method of the **Configuration** class as follows:
 
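A minimal sketch of that call, assuming a configuration has already been created as described above (the helper method shown is an assumption about recent library versions):

```python
# Hedged sketch: retrieve the configuration previously set up with
# Configuration.create; an error is raised if none has been created.
from hdx.api.configuration import Configuration

configuration = Configuration.read()
site_url = configuration.get_hdx_site_url()  # assumed helper; returns the HDX site URL
```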
@@ -383,7 +400,7 @@ Configuration instances passed to the constructors of HDX objects like Dataset e
 ## Configuring Logging
 
 If you use a facade from **hdx.facades**, then logging will go to console and errors to
-file. If you are not using a facade, you can call **setup\_logging** which takes
+file. If you are not using a facade, you can call **setup_logging** which takes
 an argument error_file which is False by default. If set to True, errors will be written
 to a file.
 
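A minimal sketch of configuring logging without a facade; the import path below comes from the companion hdx-python-utilities library and may vary by version:

```python
# Hedged sketch: console logging plus error logging to a file.
# Import path is an assumption about the companion utilities library.
from hdx.utilities.easy_logging import setup_logging

setup_logging(error_file=True)
```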
@@ -409,13 +426,13 @@ Then use the logger like this:
 
 ## Operations on HDX Objects
 
-You can read an existing HDX object with the static **read\_from\_hdx** method
+You can read an existing HDX object with the static **read_from_hdx** method,
 which takes an identifier parameter and returns an object of the appropriate HDX
 object type eg. **Dataset**, or **None** depending upon whether the object was read eg.
 
     dataset = Dataset.read_from_hdx("DATASET_ID_OR_NAME")
 
-You can search for datasets and resources in HDX using the **search\_in\_hdx** method
+You can search for datasets and resources in HDX using the **search_in_hdx** method,
 which takes a query parameter and returns a list of objects of the appropriate HDX
 object type eg. **list[Dataset]**. Here is an example:
 
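A hedged sketch of such a search; the query string is illustrative, not from the original text:

```python
# Hedged sketch: full-text search returning a list of Dataset objects.
# The query string "displacement" is an illustrative assumption.
from hdx.data.dataset import Dataset

datasets = Dataset.search_in_hdx("displacement")
print(len(datasets))
```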
@@ -464,9 +481,9 @@ and recommended, while JSON is also accepted eg.
 
     dataset.update_from_json([path])
 
-The default path if unspecified is **config/hdx\_TYPE\_static.yaml** for YAML and
-**config/hdx\_TYPE\_static.json** for JSON where TYPE is an HDX object's type like
-dataset or resource eg. **config/hdx\_showcase\_static.json**. The YAML file takes the
+The default path if unspecified is **config/hdx_TYPE_static.yaml** for YAML and
+**config/hdx_TYPE_static.json** for JSON where TYPE is an HDX object's type like
+dataset or resource eg. **config/hdx_showcase_static.json**. The YAML file takes the
 following form:
 
     owner_org: "acled"
@@ -485,35 +502,37 @@ Notice how you can define resources (each resource starts with a dash "-") withi
 file as shown above.
 
 You can check if all the fields required by HDX are populated by
-calling **check\_required\_fields**. This will throw an exception if any fields are
+calling **check_required_fields**. This will throw an exception if any fields are
 missing. Before the library posts data to HDX, it will call this method automatically.
 You can provide a list of fields to ignore in the check. An example usage:
 
     resource.check_required_fields([ignore_fields])
 
 Once the HDX object is ready ie. it has all the required metadata, you simply
-call **create\_in\_hdx** eg.
+call **create_in_hdx** eg.
 
     dataset.create_in_hdx(allow_no_resources, update_resources,
                           update_resources_by_name,
                           remove_additional_resources)
 
-Existing HDX objects can be updated by calling **update\_in\_hdx** eg.
+If the object already exists, it will be updated. You can also update
+explicitly by calling **update_in_hdx** eg.
 
     dataset.update_in_hdx(update_resources, update_resources_by_name,
                           remove_additional_resources)
 
-You can delete HDX objects using **delete\_from\_hdx** and update an object that
-already exists in HDX with the method **update\_in\_hdx**. These take various boolean
-parameters that all have defaults and are documented in the API docs. They do not return
-anything and they throw exceptions for failures like the object to update not existing.
+You can delete HDX objects using **delete_from_hdx** and update an object that
+already exists in HDX with the method **update_in_hdx**. These take various
+boolean parameters that all have defaults and are documented in the API docs.
+They do not return anything and they throw exceptions for failures like the
+object to update not existing.
 
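A minimal sketch of deletion, which is irreversible on the server and so is typically only run against a test site; the identifier is illustrative:

```python
# Hedged sketch: read a dataset, then delete it from HDX.
# delete_from_hdx throws an exception on failure, eg. if the object is missing.
from hdx.data.dataset import Dataset

dataset = Dataset.read_from_hdx("DATASET_ID_OR_NAME")  # illustrative identifier
dataset.delete_from_hdx()
```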
 ## Dataset Specific Operations
 
 A dataset can have resources and can be in a showcase.
 
 If you wish to add resources, you can supply a list and call
-the **add\_update\_resources** function, for example:
+the **add_update_resources** function, for example:
 
     resources = [{
         "name": xlsx_resourcename,
@@ -528,27 +547,27 @@ the **add\_update\_resources** function, for example:
         resource["description"] = resource["url"].rsplit("/", 1)[-1]
     dataset.add_update_resources(resources)
 
-Calling **add\_update\_resources** creates a list of HDX Resource objects in
+Calling **add_update_resources** creates a list of HDX Resource objects in
 dataset and operations can be performed on those objects.
 
-To see the list of resources, you use the **get\_resources** function eg.
+To see the list of resources, you use the **get_resources** function eg.
 
     resources = dataset.get_resources()
 
 If you wish to add one resource, you can supply an id string, dictionary or Resource
-object and call the **add\_update\_resource**\* function, for example:
+object and call the **add_update_resource** function, for example:
 
     dataset.add_update_resource(resource)
 
-You can delete a Resource object from the dataset using the **delete\_resource** function, for example:
+You can delete a Resource object from the dataset using the **delete_resource** function, for example:
 
     dataset.delete_resource(resource)
 
 You can get all the resources from a list of datasets as follows:
 
     resources = Dataset.get_all_resources(datasets)
 
-To see the list of showcases a dataset is in, you use the **get\_showcases** function eg.
+To see the list of showcases a dataset is in, you use the **get_showcases** function eg.
 
     showcases = dataset.get_showcases()
 
@@ -562,12 +581,12 @@ If you wish to add the dataset to a showcase, you must first create the showcase
                              "url": "http://visualisation/url/"})
     showcase.create_in_hdx()
 
-Then you can supply an id, dictionary or Showcase object and call the **add\_showcase**
+Then you can supply an id, dictionary or Showcase object and call the **add_showcase**
 function, for example:
 
     dataset.add_showcase(showcase)
 
-You can remove the dataset from a showcase using the **remove\_showcase** function, for
+You can remove the dataset from a showcase using the **remove_showcase** function, for
 example:
 
     dataset.remove_showcase(showcase)
@@ -678,7 +697,7 @@ occur if a valid region name is supplied.
 
     dataset.add_region_location("M49 REGION CODE")
 
-**add\_region\_location** accepts regions, intermediate regions or subregions as
+**add_region_location** accepts regions, intermediate regions or subregions as
 specified on the
 [UNStats M49](https://unstats.un.org/unsd/methodology/m49/overview/) website.
 
@@ -875,7 +894,7 @@ You can download a resource using the **download** function eg.
 
     url, path = resource.download("FOLDER_TO_DOWNLOAD_TO")
 
-If you do not supply **FOLDER\_TO\_DOWNLOAD\_TO**, then a temporary folder is used.
+If you do not supply **FOLDER_TO_DOWNLOAD_TO**, then a temporary folder is used.
 
 Before creating or updating a resource, it is possible to specify the path to a local
 file to upload to the HDX filestore if that is preferred over hosting the file
@@ -889,16 +908,29 @@ There is a getter to read the value back:
 
     file_to_upload = resource.get_file_to_upload()
 
+To indicate that the data in an external resource (given by a URL) has been
+updated, call **mark_data_updated** on the resource before calling
+**create_in_hdx** or **update_in_hdx** on the dataset; this results in the
+resource `last_modified` field being set to now. Alternatively, when calling
+**create_in_hdx** or **update_in_hdx** on the resource, it is possible to
+supply the parameter `data_updated` eg.
+
+    resource.update_in_hdx(data_updated=True)
+
+If the method **set_file_to_upload** is used to supply a file, the resource
+`last_modified` field is set to now automatically, regardless of the value of
+`data_updated`.
+
 ## Showcase Management
 
 The **Showcase** class enables you to manage showcases, creating, deleting and updating
 (as for other HDX objects) according to your permissions.
 
-To see the list of datasets a showcase is in, you use the **get\_datasets** function eg.
+To see the list of datasets in a showcase, you use the **get_datasets** function eg.
 
     datasets = showcase.get_datasets()
 
-If you wish to add a dataset to a showcase, you call the **add\_dataset** function, for
+If you wish to add a dataset to a showcase, you call the **add_dataset** function, for
 example:
 
     showcase.add_dataset(dataset)
@@ -1028,10 +1060,10 @@ Next create a file called **run.py** and copy into it the code below.
     facade(main, hdx_site="test")
 
 The above file will create in HDX a dataset generated by a function called
-**generate\_dataset** that can be found in the file **my\_code.py** which we will now
+**generate_dataset** that can be found in the file **my_code.py** which we will now
 write.
 
-Create a file **my\_code.py** and copy into it the code below:
+Create a file **my_code.py** and copy into it the code below:
 
     #!/usr/bin/python
     # -*- coding: utf-8 -*-
@@ -1050,7 +1082,7 @@ Create a file **my\_code.py** and copy into it the code below:
         """
         logger.debug("Generating dataset!")
 
-You can then fill out the function **generate\_dataset** as required.
+You can then fill out the function **generate_dataset** as required.
 
 # IDMC Example
 
@@ -1061,7 +1093,7 @@ folder. If you run it unchanged, it will overwrite the existing datasets in the
 organisation! Therefore, you should run it against a test server. If you use it as a
 basis for your code, you will need to modify the dataset **name** in **idmc.py** and
 change the organisation information to your organisation. Also update metadata in
-**config/hdx\_dataset\_static.yaml** appropriately.
+**config/hdx_dataset_static.yaml** appropriately.
 
 The IDMC scraper creates a dataset per country in HDX, populating all the required
 metadata. It then creates resources with files held on the HDX filestore.