|
4 | 4 | Arrow Support |
5 | 5 | ============= |
6 | 6 |
|
7 | | -Arrow is an in memory data exchange format that is the spritual |
8 | | -successor to the numpy array interface. It provides for zero copy |
9 | | -access to columnar data, which in our case is Image data. |
| 7 | +`Arrow <https://arrow.apache.org/>`__ |
| 8 | +is an in-memory data exchange format that is the spiritual |
| 9 | +successor to the NumPy array interface. It provides for zero-copy |
| 10 | +access to columnar data, which in our case is ``Image`` data. |
10 | 11 |
|
11 | | -The goal with Arrow is to provide native zero-copy interop with any |
12 | | -arrow provider or consumer in the Python ecosystem. |
| 12 | +The goal with Arrow is to provide native zero-copy interoperability |
| 13 | +with any Arrow provider or consumer in the Python ecosystem. |
13 | 14 |
|
14 | | -.. warning:: Zero-copy does not mean zero allocation -- The internal |
| 15 | +.. warning:: Zero-copy does not mean zero allocation -- the internal |
15 | 16 | memory layout of Pillow images contains an allocation for row |
16 | 17 | pointers, so there is a non-zero, but significantly smaller than a |
17 | | - full copy memory cost to reading an arrow image. |
| 18 | + full-copy memory cost to reading an Arrow image. |
18 | 19 |
|
19 | 20 |
|
20 | 21 | Data Formats |
21 | 22 | ============ |
22 | 23 |
|
23 | | -Pillow currently supports exporting arrow images in all modes |
| 24 | +Pillow currently supports exporting Arrow images in all modes |
24 | 25 | **except** for ``BGR;15``, ``BGR;16`` and ``BGR;24``. This is due to |
25 | | -line length packing in these modes making for non-continuous memory. |
| 26 | +line-length packing in these modes making for non-contiguous memory. |
26 | 27 |
|
27 | | -For single band images, the exported array is width*height elements, |
28 | | -with each pixel corresponding to the appropriate arrow type. |
| 28 | +For single-band images, the exported array is width*height elements, |
| 29 | +with each pixel corresponding to the appropriate Arrow type. |
29 | 30 |
|
30 | | -For multiband images, the exported array is width*height fixed length |
31 | | -4 element arrays of uint8. This is memory compatible with the raw |
32 | | -image storage of 4 bytes per pixel. |
| 31 | +For multiband images, the exported array is width*height fixed-length |
| 32 | +four-element arrays of uint8. This is memory compatible with the raw |
| 33 | +image storage of four bytes per pixel. |
33 | 34 |
|
34 | | -Mode ``1`` images are exported as 1 uint8 byte/pixel, as this is |
| 35 | +Mode ``1`` images are exported as one uint8 byte/pixel, as this is |
35 | 36 | consistent with the internal storage. |
36 | 37 |
|
37 | 38 | Pillow will accept, but not produce, one other format. For any |
38 | | -multichannel image with 32 bit storage per pixel, Pillow will accept |
| 39 | +multichannel image with 32-bit storage per pixel, Pillow will accept |
39 | 40 | an array of width*height int32 elements, which will then be |
40 | | -interpreted using the mode specific interpretation of the bytes. |
| 41 | +interpreted using the mode-specific interpretation of the bytes. |
41 | 42 |
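The relationship between the four-element uint8 layout and the accepted int32 layout can be illustrated without Pillow at all. The following is a minimal sketch using only the standard library; the pixel values and variable names are hypothetical, and it only shows that both layouts describe the same bytes, not how Pillow itself performs the import.

```python
import sys

width, height = 2, 2

# Hypothetical RGBA pixels, four uint8 values each, stored row by row.
pixels = [(255, 0, 0, 255), (0, 255, 0, 255), (0, 0, 255, 255), (10, 20, 30, 40)]
raw = bytearray(value for pixel in pixels for value in pixel)

# View 1: width*height*4 individual uint8 values (four bytes per pixel).
as_uint8 = memoryview(raw)

# View 2: the same buffer reinterpreted as width*height native int32 values.
# No bytes are copied; only the element type of the view changes.
as_int32 = as_uint8.cast("i")

# Converting an int32 element back to bytes in native order recovers the
# original per-band values, which is what a mode-specific reader relies on.
last_pixel = tuple(as_int32[3].to_bytes(4, sys.byteorder, signed=True))
```

The numeric value of each int32 element depends on the host byte order, which is why the bytes, rather than the integer values, carry the mode-specific meaning.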
|
42 | | -The image mode must match the arrow band format when reading single |
43 | | -channel images |
| 43 | +The image mode must match the Arrow band format when reading |
| 44 | +single-channel images. |
44 | 45 |
|
45 | 46 | Memory Allocator |
46 | 47 | ================ |
47 | 48 |
|
48 | 49 | Pillow's default memory allocator, the :ref:`block_allocator`, |
49 | | -allocates up to a 16MB block for images by default. Larger images |
| 50 | +allocates up to a 16 MB block for images by default. Larger images |
50 | 51 | overflow into additional blocks. Arrow requires a single contiguous |
51 | 52 | memory allocation, so images allocated in multiple blocks cannot be |
52 | | -exported in the arrow format. |
| 53 | +exported in the Arrow format. |
53 | 54 |
|
54 | 55 | To enable the single block allocator:: |
55 | 56 |
|
56 | 57 | from PIL import Image |
57 | 58 | Image.core.set_use_block_allocator(1) |
58 | 59 |
|
59 | | -Note that this is a global setting, not a per image setting. |
| 60 | +Note that this is a global setting, not a per-image setting. |
60 | 61 |
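As a rough guide to when an image is likely to span more than one block, the raw storage size can be compared against the default block size. This is only a back-of-the-envelope sketch: the helper name is hypothetical, the block size is assumed to be 16 * 1024 * 1024 bytes, and row alignment and padding inside the real allocator are ignored, so results near the limit are not reliable.

```python
# Assumed default block size in bytes (the "16 MB" mentioned above).
DEFAULT_BLOCK_SIZE = 16 * 1024 * 1024


def roughly_fits_in_one_block(width, height, bytes_per_pixel,
                              block_size=DEFAULT_BLOCK_SIZE):
    """Estimate whether raw pixel storage fits in a single block.

    Hypothetical helper: ignores alignment and per-row padding.
    """
    return width * height * bytes_per_pixel <= block_size


# Multiband images use four bytes per pixel.
small_rgb = roughly_fits_in_one_block(1024, 1024, 4)   # about 4 MB
large_rgb = roughly_fits_in_one_block(4096, 4096, 4)   # about 64 MB
```

Under these assumptions the 1024x1024 image fits in one block, while the 4096x4096 image would overflow into additional blocks and could not be exported unless the single block allocator is enabled.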
|
61 | 62 | Unsupported Features |
62 | 63 | ==================== |
63 | 64 |
|
64 | | -* Table/Dataframe protocol. We currently support a single array. |
| 65 | +* Table/dataframe protocol. We support a single array. |
65 | 66 | * Null markers, producing or consuming. Null values are inferred from |
66 | 67 | the mode. For example, RGB images are stored in the first three bytes of |
67 | | - each 32 bit pixel, and the last byte is an implied null. |
68 | | -* Schema Negotiation. There is an optional schema for the requested |
69 | | - datatype in the arrow source interface. We currently ignore that |
| 68 | + each 32-bit pixel, and the last byte is an implied null. |
| 69 | +* Schema negotiation. There is an optional schema for the requested |
| 70 | + datatype in the Arrow source interface. We ignore that |
70 | 71 | parameter. |
71 | | -* Array Metadata. |
| 72 | +* Array metadata. |
72 | 73 |
|
73 | 74 | Internal Details |
74 | 75 | ================ |
75 | 76 |
|
76 | 77 | Python Arrow C interface: |
77 | 78 | https://arrow.apache.org/docs/format/CDataInterface/PyCapsuleInterface.html |
78 | 79 |
|
79 | | -The memory that is exported from the arrow interface is shared -- not |
| 80 | +The memory that is exported from the Arrow interface is shared -- not |
80 | 81 | copied, so the lifetime of the memory allocation is no longer strictly |
81 | | -tied to the life of the python object. |
| 82 | +tied to the life of the Python object. |
82 | 83 |
|
83 | 84 | The core imaging struct now has a refcount associated with it, and the |
84 | | -lifetime of the core image struct is now divorced from the python |
| 85 | +lifetime of the core image struct is now divorced from the Python |
85 | 86 | image object. Creating an Arrow reference to the image increments the |
86 | 87 | refcount, and the imaging struct is only released when the refcount |
87 | | -reaches 0. |
| 88 | +reaches zero. |
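The lifetime rule described above can be pictured with a toy model. The class and method names below are hypothetical and this is not Pillow's C implementation; it only sketches the idea that the struct outlives the Python image object while an Arrow consumer still holds a reference.

```python
class CoreImageModel:
    """Hypothetical stand-in for the core imaging struct."""

    def __init__(self):
        self.refcount = 1      # reference held by the Python image object
        self.released = False  # becomes True once the memory would be freed

    def incref(self):
        self.refcount += 1

    def decref(self):
        self.refcount -= 1
        if self.refcount == 0:
            self.released = True


core = CoreImageModel()
core.incref()                        # an Arrow consumer takes a reference
core.decref()                        # the Python image object goes away
alive_after_image_deleted = not core.released
core.decref()                        # the Arrow consumer releases its reference
```

In this model the shared memory stays valid after the image object is gone and is released only when the last reference is dropped.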