Commit a7580a8

make release-tag: Merge branch 'master' into stable
2 parents 50c43c0 + 0ede270 commit a7580a8

17 files changed

Lines changed: 520 additions & 135 deletions

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -64,6 +64,7 @@ instance/

 # Sphinx documentation
 docs/_build/
+docs/pipeline.json

 # PyBuilder
 target/

HISTORY.md

Lines changed: 7 additions & 0 deletions
@@ -1,6 +1,13 @@
 Changelog
 =========

+0.3.0 - New Primitives Discovery
+--------------------------------
+
+* New primitives discovery system based on `entry_points`.
+* Conditional Hyperparameters filtering in MLBlock initialization.
+* Improved logging and exception reporting.
+
 0.2.4 - New Datasets and Unit Tests
 -----------------------------------

docs/advanced_usage/adding_primitives.rst

Lines changed: 31 additions & 3 deletions
@@ -29,7 +29,7 @@ by writing the corresponding `JSON annotation <primitives.html#json-annotations>

 .. _MLPrimitives integrated primitives: https://github.com/HDI-Project/MLPrimitives/tree/master/mlblocks_primitives

-.. note:: If you integrate new primitives for MLBlocks, please consider contributing them to the
+.. note:: If you create new primitives for MLBlocks, please consider contributing them to the
           **MLPrimitives** project!

 The first thing to do when adding a new primitive is making sure that it complies with the
@@ -58,8 +58,8 @@ place known to **MLBlocks**.
 **MLBlocks** looks for primitives in the following folders, in this order:

 1. Any folder specified by the user, starting by the latest one.
-2. A folder named `mlblocks_primitives` in the current working directory.
-3. A folder named `mlblocks_primitives` in the `system prefix`_.
+2. A folder named ``mlblocks_primitives`` or ``mlprimitives`` in the current working directory.
+3. A folder named ``mlblocks_primitives`` or ``mlprimitives`` in the `system prefix`_.

 .. _system prefix: https://docs.python.org/3/library/sys.html#sys.prefix
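
Note on the hunk above: the documented lookup order can be sketched roughly as follows. This is only an illustration under stated assumptions, not the MLBlocks implementation. The function name `candidate_primitive_dirs` is hypothetical, and the relative order of the two folder names within each location is an assumption, since the documentation does not specify it.

```python
import os
import sys


def candidate_primitive_dirs(user_paths=()):
    """Return candidate primitive folders in the documented lookup order.

    Hypothetical helper, not part of MLBlocks.
    """
    # Assumption: 'mlblocks_primitives' is tried before 'mlprimitives'.
    folder_names = ('mlblocks_primitives', 'mlprimitives')

    # 1. User-specified folders, starting with the most recently added one.
    candidates = list(reversed(list(user_paths)))

    # 2. The current working directory, then 3. the system prefix.
    for base in (os.getcwd(), sys.prefix):
        for folder in folder_names:
            candidates.append(os.path.join(base, folder))

    return candidates
```

Calling `candidate_primitive_dirs(['first', 'second'])` lists `second` before `first`, followed by the two folders under the working directory and the two under `sys.prefix`.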

@@ -80,3 +80,31 @@ However, sometimes you will want to add a custom directory.
 This can be easily done by using the `mlblocks.add_primitives_path`_ method.

 .. _mlblocks.add_primitives_path: ../api_reference.html#mlblocks.add_primitives_path
+
+Developing a Primitives Library
+-------------------------------
+
+Another option to add multiple libraries is creating a primitives library, such as
+`MLPrimitives`_.
+
+In order to make **MLBLocks** able to find the primitives defined in such a library,
+all you need to do is setting up an `Entry Point`_ in your `setup.py` script with the
+following specification:
+
+1. It has to be published under the name ``mlprimitives``.
+2. It has to be named exactly ``jsons_path``.
+3. It has to point at a variable that contains the path to the JSONS folder.
+
+An example of such an entry point would be::
+
+    entry_points = {
+        'mlprimitives': [
+            'jsons_path=some_module:SOME_VARIABLE'
+        ]
+    }
+
+where the module `some_module` contains a variable such as::
+
+    SOME_VARIABLE = os.path.join(os.path.dirname(__file__), 'jsons')
+
+.. _Entry Point: https://packaging.python.org/specifications/entry-points/
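
Note on the entry point described above: the sketch below shows one way such entry points could be read on the consumer side. It is a sketch under stated assumptions and does not claim to be how MLBlocks performs discovery. It uses the standard library `importlib.metadata` module (Python 3.8 or later), and the function name `find_jsons_paths` is hypothetical.

```python
from importlib import metadata


def find_jsons_paths(group='mlprimitives', name='jsons_path'):
    """Load every entry point called `name` that is published under `group`.

    Hypothetical helper: it returns whatever objects the entry points
    reference, which the documentation says should be folder paths.
    """
    entry_points = metadata.entry_points()

    if hasattr(entry_points, 'select'):
        # Python 3.10+: entry points are filtered through `select`.
        candidates = entry_points.select(group=group)
    else:
        # Python 3.8 and 3.9: a plain dict keyed by group name.
        candidates = entry_points.get(group, [])

    return [entry_point.load() for entry_point in candidates if entry_point.name == name]
```

If no installed distribution publishes a matching entry point, the function returns an empty list.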

docs/advanced_usage/hyperparameters.rst

Lines changed: 18 additions & 1 deletion
@@ -165,6 +165,19 @@ Conditional Hyperparameters

 In some other cases, the values that a hyperparameter can take depend on the value of another
 one.
+For example, sometimes a primitive has a hyperparameter that specifies a kernel, and depending
+on the kernel used some other hyperparameters may be or not be used, or they might be able
+to take only some specific values.
+
+In this case, the ``type`` of the hyperparameter whose values depend on the other is specified
+as ``conditional``.
+In this case, two additional entries are required:
+
+* an entry called ``condition``, which specifies the name of the other hyperparameter, the value
+  of which is evaluated to decide which values this hyperparameter can take.
+* an additional subdictionary called ``values``, which relates the possible values that the
+  `condition` hyperparameter can have with the full specifications of the type and values that
+  this hyperparameter can take in each case.

 Suppose, for example, that the primitive explained in the previous point does not expect
 the ``mean``, ``min`` or ``max`` strings as values for the ``max_features`` hyperparameter,
@@ -190,7 +203,7 @@ In this case, the hyperparameters would be annotated like this::
         }
         "max_features_aggregation": {
             "type": "conditional",
-            "condition": "mas_features",
+            "condition": "max_features",
             "default": null,
             "values": {
                 "auto": {
@@ -202,6 +215,10 @@ In this case, the hyperparameters would be annotated like this::
         }
     }

+.. note:: Just like a regular hyperparameter, if there is no match the default entry is used.
+    In this example, the ``null`` value indicates that the hyperparameter needs to be
+    disabled if there is no match, but instead of it we could add there a full specification
+    of type, range and default value as a nested dictionary to be used by default.

 .. _JSON Annotations: primitives.html#json-annotations
 .. _MLPrimitives: https://github.com/HDI-Project/MLPrimitives
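
Note on the conditional hyperparameters documented above: the resolution rule can be illustrated with a small standalone sketch. The function mirrors the `_filter_conditional` method added to `mlblocks/mlblock.py` in this commit, but the name `resolve_conditional` and the nested specification under `"auto"` are hypothetical, since the diff does not show that part of the annotation.

```python
def resolve_conditional(conditional, init_params):
    """Pick the specification that applies given the condition's value."""
    condition = conditional['condition']
    default = conditional.get('default')

    # The hyperparameter that drives the condition has not been set.
    if condition not in init_params:
        return default

    # Fall back to the default entry when no value matches.
    return conditional['values'].get(init_params[condition], default)


# Hypothetical annotation: the nested "auto" specification is an assumption.
spec = {
    'type': 'conditional',
    'condition': 'max_features',
    'default': None,
    'values': {
        'auto': {'type': 'str', 'default': 'mean', 'values': ['mean', 'min', 'max']},
    },
}

matched = resolve_conditional(spec, {'max_features': 'auto'})
unmatched = resolve_conditional(spec, {'max_features': 10})
```

Here `matched` is the nested specification for `"auto"`, while `unmatched` is `None`, which corresponds to the `null` default that disables the hyperparameter.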

docs/api/mlblocks.primitives.rst

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+mlblocks.primitives
+===================
+
+.. automodule:: mlblocks.primitives
+   :members:

docs/index.rst

Lines changed: 1 addition & 0 deletions
@@ -74,6 +74,7 @@ integrate with deep learning libraries.

    api/mlblocks
    api/mlblocks.datasets
+   api/mlblocks.primitives

 .. toctree::
    :caption: Resources

docs/pipeline.json

Lines changed: 0 additions & 91 deletions
This file was deleted.

mlblocks/__init__.py

Lines changed: 4 additions & 4 deletions
@@ -10,15 +10,15 @@
 * Documentation: https://HDI-Project.github.io/MLBlocks
 """

-from mlblocks.mlblock import MLBlock # noqa
-from mlblocks.mlpipeline import MLPipeline # noqa
-from mlblocks.primitives import add_primitives_path, get_primitives_paths, load_primitive # noqa
+from mlblocks.mlblock import MLBlock
+from mlblocks.mlpipeline import MLPipeline
+from mlblocks.primitives import add_primitives_path, get_primitives_paths, load_primitive

 __author__ = 'MIT Data To AI Lab'
 __copyright__ = 'Copyright (c) 2018, MIT Data To AI Lab'
 __email__ = 'dailabmit@gmail.com'
 __license__ = 'MIT'
-__version__ = '0.2.4'
+__version__ = '0.3.0-dev'

 __all__ = [
     'MLBlock', 'MLPipeline', 'add_primitives_path',

mlblocks/datasets.py

Lines changed: 9 additions & 0 deletions
@@ -40,6 +40,7 @@
 """

 import io
+import logging
 import os
 import tarfile
 import urllib
@@ -52,6 +53,8 @@
 from sklearn.metrics import accuracy_score, normalized_mutual_info_score, r2_score
 from sklearn.model_selection import KFold, StratifiedKFold, train_test_split

+LOGGER = logging.getLogger(__name__)
+
 INPUT_SHAPE = [224, 224, 3]

 DATA_PATH = os.path.join(
@@ -183,9 +186,12 @@ def get_splits(self, n_splits=1):

 def _download(dataset_name, dataset_path):
     url = DATA_URL.format(dataset_name)
+
+    LOGGER.debug('Downloading dataset %s from %s', dataset_name, url)
     response = urllib.request.urlopen(url)
     bytes_io = io.BytesIO(response.read())

+    LOGGER.debug('Extracting dataset into %s', DATA_PATH)
     with tarfile.open(fileobj=bytes_io, mode='r:gz') as tf:
         tf.extractall(DATA_PATH)

@@ -202,6 +208,7 @@ def _load(dataset_name):


 def _load_images(image_dir, filenames):
+    LOGGER.debug('Loading %s images from %s', len(filenames), image_dir)
     images = []
     for filename in filenames:
         filename = os.path.join(image_dir, filename)
@@ -217,6 +224,8 @@ def _load_images(image_dir, filenames):

 def _load_csv(dataset_path, name, set_index=False):
     csv_path = os.path.join(dataset_path, name + '.csv')
+
+    LOGGER.debug('Loading csv %s', csv_path)
     df = pd.read_csv(csv_path)

     if set_index:
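
Note on the logging calls added in this file: they are emitted at `DEBUG` level through module loggers created with `logging.getLogger(__name__)`, so they stay silent unless the caller configures logging. The sketch below shows one way to surface them using only the standard `logging` module. The helper name `enable_debug_logging` and the message format are hypothetical choices, not something the commit provides.

```python
import logging


def enable_debug_logging(logger_name='mlblocks'):
    """Show DEBUG messages from `logger_name` and all of its child loggers."""
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)

    # Attach a console handler only once, so repeated calls do not duplicate output.
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter('%(asctime)s %(name)s %(levelname)s %(message)s')
        )
        logger.addHandler(handler)

    return logger
```

Because loggers named `mlblocks.datasets` and `mlblocks.mlblock` are children of `mlblocks`, configuring the parent is enough for their messages to be displayed.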

mlblocks/mlblock.py

Lines changed: 33 additions & 8 deletions
@@ -3,9 +3,12 @@
 """Package where the MLBlock class is defined."""

 import importlib
+import logging

 from mlblocks.primitives import load_primitive

+LOGGER = logging.getLogger(__name__)
+

 def import_object(object_name):
     """Import an object from its Fully Qualified Name."""
@@ -83,7 +86,7 @@ def _extract_params(self, kwargs, hyperparameters):
                 value = param['default']

             else:
-                raise TypeError("Required argument '{}' not found".format(name))
+                raise TypeError("{} required argument '{}' not found".format(self.name, name))

             init_params[name] = value

@@ -107,6 +110,33 @@ def _extract_params(self, kwargs, hyperparameters):

         return init_params, fit_params, produce_params

+    @staticmethod
+    def _filter_conditional(conditional, init_params):
+        condition = conditional['condition']
+        default = conditional.get('default')
+
+        if condition not in init_params:
+            return default
+
+        condition_value = init_params[condition]
+        values = conditional['values']
+        return values.get(condition_value, default)
+
+    @classmethod
+    def _get_tunable(cls, hyperparameters, init_params):
+        tunable = dict()
+        for name, param in hyperparameters.get('tunable', dict()).items():
+            if name not in init_params:
+                if param['type'] == 'conditional':
+                    param = cls._filter_conditional(param, init_params)
+                    if param is not None:
+                        tunable[name] = param
+
+                else:
+                    tunable[name] = param
+
+        return tunable
+
     def __init__(self, name, **kwargs):

         self.name = name
@@ -133,13 +163,7 @@ def __init__(self, name, **kwargs):
         self._fit_params = fit_params
         self._produce_params = produce_params

-        tunable = hyperparameters.get('tunable', dict())
-        self._tunable = {
-            name: param
-            for name, param in tunable.items()
-            if name not in init_params
-            # TODO: filter conditionals
-        }
+        self._tunable = self._get_tunable(hyperparameters, init_params)

         default = {
             name: param['default']
@@ -193,6 +217,7 @@ def set_hyperparameters(self, hyperparameters):
         self._hyperparameters.update(hyperparameters)

         if self._class:
+            LOGGER.debug('Creating a new primitive instance for %s', self.name)
             self.instance = self.primitive(**self._hyperparameters)

     def fit(self, **kwargs):
