11@Marco please move these into `docs/further-background/wrapping-derived-types.md` (and do any other clean-up and formatting fixes you'd like)
22
3- Wrapping derived types is tricky.
4- Notably, [ f2py] (@Marco please add link) does not provide direct support for it.
3+ ## What is the goal?
4+
5+ The goal is to be able to run MAGICC, a model written in Fortran, from Python.
6+ This means we need to be able to instantiate MAGICC's inputs in memory in Python,
7+ pass them to Fortran to solve the model and get them back as results in Python.
8+
9+ Our data is not easily represented as primitive types (floats, ints, strings, arrays)
10+ because we want to have more robust data handling, e.g. attaching units to arrays.
11+ As a result, we need to pass objects to Fortran and return Fortran derived types to Python.
12+ It turns out that wrapping derived types is tricky.
13+ Notably, [f2py](https://numpy.org/doc/stable/f2py/)
14+ does not provide direct support for it.
515As a result, we need to come up with our own solution.
616
717## Our solution
818
19+ Our solution is based on a key simplifying assumption.
20+ Once we have passed data across the Python-Fortran interface,
21+ there is no way to modify it again from the other side of the interface.
22+ In other words, our wrappers are not views.
23+ Instead, they are independent instantiations of the same (or as similar as possible) data models.
24+
25+ For example, if I have an object in Python
26+ and I pass this to a wrapped Fortran function which alters some attribute of this object,
27+ that modification will only happen on the Fortran side:
28+ the original Python object will remain unchanged
29+ (to see the result, we must return a new Python object from the Fortran wrapper).
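
A minimal sketch of these copy semantics, written in pure Python. The names `State` and `wrapped_increment` are hypothetical, and `copy.deepcopy` only mimics the data being copied across the Python-Fortran interface; no real Fortran is involved.

```python
import copy
from dataclasses import dataclass


@dataclass
class State:
    """Hypothetical Python mirror of a Fortran derived type."""

    value1: int


def wrapped_increment(state: State) -> State:
    """Stand-in for a wrapped Fortran function that alters an attribute."""
    # Crossing the interface copies the data (mimicked here with deepcopy)
    fortran_side = copy.deepcopy(state)
    fortran_side.value1 += 1
    # The result crosses back as another independent copy
    return copy.deepcopy(fortran_side)


original = State(value1=2)
updated = wrapped_increment(original)

print(original.value1)  # the original Python object is unchanged
print(updated.value1)  # the modification is only visible on the returned object
```

The caller has to use the returned object to see the change, which is the behaviour the assumption above implies.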
30+
31+ This assumption makes ownership and memory management clear.
32+ We do not need to keep instances around as views
33+ and worry about consistency across the Python-Fortran interface.
34+ Instead, we simply pass data back and forth,
35+ and the normal rules of data consistency within each programming language apply.
36+
37+ To actually pass derived types back and forth across the Python-Fortran interface,
38+ we introduce a 'manager' module for all derived types.
39+
40+ The manager module has two key components:
41+
42+ 1. an allocatable array of instances of the derived type it manages,
43+ call this `instance_array`.
44+ The instances in this array are owned by the manager.
45+ In practice, they are essentially temporary variables.
46+ 1. an allocatable array of logical (boolean) values,
47+ call this `available_array`.
48+ The convention is that, if `available_array(i)`, where `i` is an integer,
49+ is `.true.`, then the instance at `instance_array(i)` is available for the manager to use.
50+ Otherwise, the manager assumes that the instance is already in use
51+ and therefore cannot be used for the operation currently being performed.
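
The two components can be sketched in Python as follows. This is an illustrative analogue only, not the Fortran implementation: the `Instance` and `Manager` classes are hypothetical, indices are 0-based here (Fortran arrays are 1-based by default), and growing the arrays when no slot is free is an assumption about how the manager might behave.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    """Hypothetical stand-in for a Fortran derived type."""

    value: float = 0.0


@dataclass
class Manager:
    """Python analogue of the manager module (illustrative only)."""

    instance_array: list = field(default_factory=list)
    available_array: list = field(default_factory=list)

    def get_free_instance_index(self) -> int:
        # Reuse the first available slot if there is one
        for idx, available in enumerate(self.available_array):
            if available:
                self.available_array[idx] = False
                return idx
        # Otherwise grow both arrays by one slot (assumed behaviour)
        self.instance_array.append(Instance())
        self.available_array.append(False)
        return len(self.instance_array) - 1

    def finalise_instance_index(self, idx: int) -> None:
        # Reset the slot and mark it as available again
        self.instance_array[idx] = Instance()
        self.available_array[idx] = True


manager = Manager()
first = manager.get_free_instance_index()
second = manager.get_free_instance_index()
manager.finalise_instance_index(first)
reused = manager.get_free_instance_index()  # the freed slot is handed out again
```

Because only integer indices leave the manager, everything that crosses the interface stays a primitive type.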
52+
53+ This setup allows us to effectively pass derived types back and forth between Python and Fortran.
54+
55+ Whenever we need to return a derived type to Python, we:
56+
57+ [TODO think about retrieving multiple derived types at once]
58+
59+ 1. get the derived type from whatever Fortran function or subroutine created it,
60+ call this `derived_type_original`
61+ 1. find an index, `idx`, in `available_array` such that `available_array(idx)` is `.true.`
62+ 1. set `instance_array(idx)` equal to `derived_type_original`
63+ 1. return `idx` to Python
64+ - `idx` is an integer, so we can return this easily to Python using `f2py`
65+ 1. create a Python object with an API that mirrors `derived_type_original`
66+ using the class method `from_instance_index`.
67+ This class method is [TODO or will be] auto-generated via `pyfgen`.
68+ It retrieves all the attribute values of `derived_type_original`
69+ from Fortran and sets them on the Python object that is being instantiated
70+ - we can do this because, if you dig down deep enough, all attributes eventually
71+ become primitive types which can be passed back and forth using `f2py`;
72+ multiple levels of recursion may be needed
73+ if you have derived types that themselves have derived type attributes
74+ 1. call the manager [TODO I think this will end up being wrapper, we can tighten the language later]
75+ module's `finalise_instance_index` function to free the (temporary) instance
76+ that was used by the manager
77+ - this instance is no longer needed because all the data has been transferred to Python
78+ 1. we end up with a Python instance that has the result
79+ and no extra/leftover memory footprint in Fortran
80+ (and leave Fortran to decide whether to clean up `derived_type_original` or not)
81+
82+ Whenever we need to pass a derived type to Fortran, we:
83+
84+ [TODO think about passing multiple derived types at once]
85+
86+ 1. call the manager [TODO I think this will end up being wrapper, we can tighten the language later]
87+ module's `get_free_instance_index` function to get an available index to use for the passing
88+ 1. call the manager [TODO I think this will end up being wrapper, we can tighten the language later]
89+ module's `build_instance` function with the index we just received
90+ plus all of the Python object's attribute values
91+ - on the Fortran side, there is now an instantiated derived type, ready for use
92+ 1. call the wrapped Fortran function of interest,
93+ except we pass the instance index instead of the derived type
94+ 1. on the Fortran side, retrieve the instance at that index from the manager module
95+ and use this to call the Fortran function/subroutine of interest
96+ 1. return the result from Fortran back to Python
97+ 1. call the manager [TODO I think this will end up being wrapper, we can tighten the language later]
98+ module's `finalise_instance_index` function to free the (temporary) instance
99+ that was used to pass the instance in the first place
100+ - this instance is no longer needed because all the data has been transferred and used by Fortran
101+ 1. we end up with the result of the Fortran callable back in Python
102+ and no extra/leftover memory footprint in Fortran from the instance created by the manager module
103+
104+ ## Further background
105+
106+ When we started this project, we initially took quite a different route.
107+ The reason was that we were actually solving a different problem.
108+ What we were trying to do was to provide views into underlying Fortran instances.
109+ For example, we wanted to enable the following:
110+
111+ ```python
112+ >>> from some_fortran_wrapper import SomeWrappedFortranDerivedType
113+
114+
115+ >>> inst = SomeWrappedFortranDerivedType(value1=2, value2="hi")
116+ >>> inst2 = inst
117+ >>> inst.value1 = 5
118+ >>> # Updating the view via `inst` also affects `inst2`
119+ >>> inst2.value1
120+ 5
121+ ```
122+
123+ Supporting views like this introduces a whole bunch of headaches,
124+ mainly due to consistency and memory management.
125+
126+ A first headache is consistency.
127+ Consider the following, which is a common gotcha with numpy:
128+
129+ ```python
130+ >>> import numpy as np
131+ >>>
132+ >>> a = np.array([1.2, 2.2, 2.5])
133+ >>> b = a
134+ >>> a[2] = 0.0
135+ >>> # b has been updated too - many users don't expect this
136+ >>> b
137+ array([1.2, 2.2, 0. ])
138+ ```
139+
140+ The second is memory management.
141+ In the example above, if I delete variable `a`,
142+ what should variable `b` become?
143+
144+ With numpy, it turns out that the answer is that `b` is unaffected:
145+
146+ ```python
147+ >>> del a
148+ >>> a
149+ Traceback (most recent call last):
150+   File "<stdin>", line 1, in <module>
151+ NameError: name 'a' is not defined
152+ >>> b
153+ array([1.2, 2.2, 0. ])
154+ ```
155+
156+ However, we would argue that this is not the only possibility.
157+ It could also be that `b` should become undefined,
158+ as the underlying array it views has been deleted.
159+ Doing it like this must also be very complicated for numpy,
160+ as they need to keep track of how many references
161+ there are to the array underlying the Python variables
162+ to know whether to actually free the memory or not.
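
As a small illustration of the bookkeeping involved, the sketch below distinguishes three cases: a second name for the same array, a numpy view (which keeps a reference to the array it views in its `base` attribute), and an independent copy. It assumes numpy is installed; the variable names are illustrative only.

```python
import numpy as np

a = np.array([1.2, 2.2, 2.5])
b = a  # a second name for the same object, not a copy
view = a[:2]  # a numpy view that shares memory with `a`
independent = a.copy()  # owns its own memory

a[0] = 0.0
del a  # removes the name only; `b` and `view` keep the data alive

print(b[0], view[0])  # both see the update made via `a`
print(independent[0])  # the copy does not
print(view.base is b)  # the view holds a reference to the original array
```

Our wrappers behave like `independent` here: data that has crossed the interface has a single, clear owner.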
163+
164+ We don't want to solve these headaches,
165+ which is why our solution does not support views
166+ and instead only supports passing data across the Python-Fortran interface.
167+ This ensures that ownership is clear at all times
168+ and that normal Python rules apply in Python
169+ (which doesn't mean there aren't gotchas, just that we won't introduce any new ones).
170+
171+ ## Other solutions we rejected
172+
173+ ### Provide views rather than passing data
174+
175+ Note: this section was never properly finished.
176+ Once we started trying to write it,
177+ we realised how hard it would be to avoid weird edge cases
178+ so we stopped and changed to [our current solution][Our solution]
179+ (@Marco please check that this internal cross-reference works
180+ once the docs are built).
181+
9182To pass derived types back and forth across the Python-Fortran interface,
10183we introduce a 'manager' module for all derived types.
11184This manager module is responsible for managing derived type instances
@@ -15,14 +188,12 @@ and is needed because we can't pass them directly using f2py.
15188The manager module has two key components:
16189
171901 . an allocatable array of instances of the derived type it manages
18- (@Marco note that this isn't how it is implemented now,
19- but this is how we will end up implementing it)
201911 . an allocatable array of logical (boolean) values
21192
22193The array of instances are instances which the manager owns.
23194It holds onto these: can instantiate them, can make them have the same values
24195as results from Fortran functions etc.
25- (@ Marco I think we need to decide whether this is an array of instances
196+ (I think we need to decide whether this is an array of instances
26197or an array of pointers to instances (although I don't think that's a thing https://fortran-lang.discourse.group/t/arrays-of-pointers/4851/6 ,
27198so doing something like this might require yet another layer of abstraction).
28199Array of instances means we have to do quite some data copying
@@ -79,8 +250,6 @@ and have slow reallocation calls sometimes (when we need to increase the number
79250There is no perfect solution, and we think this way strikes the right balance of
80251'just works' for most users while also offering access to fine-grained memory control for 'power users'.
81252
82- ## Other solutions we rejected
83-
84253### Pass pointers back and forth
85254
86255Example repository: https://github.com/Nicholaswogan/f2py-with-derived-types