Commit 8b387bd

docs: convert md to rst
1 parent 789e508 commit 8b387bd

6 files changed

Lines changed: 225 additions & 181 deletions

README.md

Lines changed: 0 additions & 93 deletions
This file was deleted.

README.rst

Lines changed: 119 additions & 0 deletions
@@ -0,0 +1,119 @@
ulist
=====

|PyPI| |License| |CI| |doc| |publish| |code style| |Coverage|

`Documentation <https://tushushu.github.io/ulist/>`__ \| `Source
Code <https://github.com/tushushu/ulist>`__

What
~~~~

| Ulist is an ultra fast list/array data structure written in Rust with
  Python bindings. It aims to be the fundamental package for processing
  and computing 1-D list/array in Python.
| It provides:

- an efficient, flexible and expressive 1-D list/array object;
- broadcasting methods;
- a SQL-like and method-chaining programming experience;

Performance
~~~~~~~~~~~

Ulist is extremely fast, even compared with libraries like NumPy. Relative to NumPy, it is:

- more efficient on ``string`` and ``boolean`` arrays,
- about as efficient on ``integer`` arrays,
- and a bit slower on ``floating`` arrays.

Being faster than NumPy is not the goal of this project, because the two are simply different libraries. Ulist focuses on general-purpose use rather than only data science, machine learning or AI; for example, it does not provide linear algebra routines. If you are curious about the performance, see the `benchmarking results <https://github.com/tushushu/ulist/blob/main/benchmark.md>`__.

Requirements
~~~~~~~~~~~~

- Python: 3.7+
- OS: Linux, macOS and Windows

Installation
~~~~~~~~~~~~

Run ``pip install ulist``

Examples
~~~~~~~~

Count the number of items in bins.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Given an array ``arr``, count the number of items in bins [0, 3), [3, 6), [6, 9) and [9, +inf). The ``result`` is a Python dictionary with bin names as keys and numbers as values.

.. code:: python

   >>> import ulist as ul

   >>> arr = ul.arange(12)
   >>> arr
   UltraFastList([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])

   >>> result = arr.case(default='[9, +inf)')\
   ...     .when(lambda x: x < 3, then='[0, 3)')\
   ...     .when(lambda x: x < 6, then='[3, 6)')\
   ...     .when(lambda x: x < 9, then='[6, 9)')\
   ...     .end()\
   ...     .counter()
   >>> result
   {'[3, 6)': 3, '[9, +inf)': 3, '[6, 9)': 3, '[0, 3)': 3}

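For readers new to the ``case``/``when`` chain, here is a plain-Python sketch of the same binning logic using only the standard library. It is not how ulist implements it. The helper name ``bin_label`` is hypothetical, and the sketch assumes the first matching condition wins, which is what the chained ``when`` calls in the example suggest.

```python
from collections import Counter


def bin_label(x: int) -> str:
    """Return the label of the first bin whose condition matches x."""
    if x < 3:
        return '[0, 3)'
    if x < 6:
        return '[3, 6)'
    if x < 9:
        return '[6, 9)'
    return '[9, +inf)'  # plays the role of the ``default`` argument


values = list(range(12))  # same data as ul.arange(12)
result = dict(Counter(bin_label(x) for x in values))
print(result)
```

Each bin receives three items, matching the counts in the ulist example (the key order may differ).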
Dot product.
^^^^^^^^^^^^

Given two 1-D arrays, calculate their dot product. In this example the same array is used for both operands.

.. code:: python

   >>> import ulist as ul

   >>> arr = ul.from_seq(range(1, 4), dtype='float')
   >>> arr
   UltraFastList([1.0, 2.0, 3.0])

   >>> result = arr.mul(arr).sum()
   >>> result
   14.0

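The ``mul`` then ``sum`` chain is an element-wise product followed by a reduction. As a cross-check, here is a minimal standard-library sketch of that arithmetic. The function name ``dot`` is an assumption for illustration and is not part of ulist.

```python
def dot(a, b):
    """Dot product of two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    # Multiply element-wise, then add up the products.
    return sum(x * y for x, y in zip(a, b))


arr = [1.0, 2.0, 3.0]
result = dot(arr, arr)  # 1*1 + 2*2 + 3*3
print(result)  # 14.0
```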
Rate of adults.
^^^^^^^^^^^^^^^

Given people's ages as ``arr``, and treating anyone aged 18 or above as an adult, clean the data by removing abnormal values and then calculate the rate (proportion) of adults.

.. code:: python

   >>> import ulist as ul

   >>> arr = ul.from_seq([-1, 10, 15, 20, 30, 50, 70, 80, 100, 200], dtype='int')
   >>> result = arr.where(lambda x: (x >= 0) & (x < 120))\
   ...     .apply(lambda x: x >= 18)\
   ...     .mean()
   >>> result
   0.75

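To make the three steps explicit (filter, map to booleans, average), here is a plain-Python sketch of the same computation. It only mirrors the arithmetic and says nothing about how ulist evaluates the chain internally.

```python
ages = [-1, 10, 15, 20, 30, 50, 70, 80, 100, 200]

# Step 1: keep only plausible ages (drops -1 and 200).
valid = [x for x in ages if 0 <= x < 120]

# Step 2: map each age to True/False depending on adulthood.
is_adult = [x >= 18 for x in valid]

# Step 3: the mean of booleans is the share of True values.
rate = sum(is_adult) / len(is_adult)
print(rate)  # 0.75
```

Eight ages survive the filter and six of them are 18 or above, hence 6 / 8 = 0.75.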
Contribute
~~~~~~~~~~

All contributions are welcome. See the `Developer Guide <https://github.com/tushushu/ulist/blob/main/develop.md>`__.

.. |PyPI| image:: https://badge.fury.io/py/ulist.svg
   :target: https://pypi.org/project/ulist/
.. |License| image:: https://img.shields.io/github/license/tushushu/ulist
   :target: https://github.com/tushushu/ulist/blob/main/LICENSE
.. |CI| image:: https://github.com/tushushu/ulist/actions/workflows/main.yml/badge.svg?branch=0.9.0
   :target: https://github.com/tushushu/ulist/actions/workflows/main.yml
.. |doc| image:: https://github.com/tushushu/ulist/actions/workflows/sphinx.yml/badge.svg?branch=0.9.0
   :target: https://github.com/tushushu/ulist/actions/workflows/sphinx.yml
.. |publish| image:: https://github.com/tushushu/ulist/actions/workflows/publish.yml/badge.svg?branch=0.9.0
   :target: https://github.com/tushushu/ulist/actions/workflows/publish.yml
.. |code style| image:: https://img.shields.io/badge/style-flake8-blue
   :target: https://github.com/PyCQA/flake8
.. |Coverage| image:: https://img.shields.io/endpoint?url=https://gist.githubusercontent.com/tushushu/3a76a8f4c0d25c24b840fe66a3cf44c1/raw/metacov.json
   :target: https://github.com/tushushu/ulist/actions/workflows/coverage.yml

benchmark.md

Lines changed: 0 additions & 67 deletions
This file was deleted.

benchmark.rst

Lines changed: 80 additions & 0 deletions
@@ -0,0 +1,80 @@
How do we benchmark?
~~~~~~~~~~~~~~~~~~~~

The benchmark is run by GitHub Actions on ``ubuntu-latest``. This document is updated every time a new version is released.

For each dtype (``int``, ``float``, ``str`` and ``bool``), several sub-tasks compare the performance of ``ulist`` and ``numpy``. Each sub-task has 5 rounds with different array sizes and numbers of runs:

- XS - array size 100, run 100K times;
- S - array size 1K, run 100K times;
- M - array size 10K, run 10K times;
- L - array size 100K, run 1K times;
- XL - array size 1M, run 100 times.

Both the result of each round and the average across rounds are recorded.

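The actual benchmark harness is not shown in this document. The following is only a sketch of how one round could be timed with the standard ``timeit`` module. The ``speedup`` and ``loop_sum`` functions are hypothetical, and defining the score as baseline time divided by candidate time is an assumption chosen so that values above 1.0x mean the candidate is faster.

```python
import timeit

# Round name -> (array size, number of runs), as listed above.
ROUNDS = {
    "XS": (100, 100_000),
    "S": (1_000, 100_000),
    "M": (10_000, 10_000),
    "L": (100_000, 1_000),
    "XL": (1_000_000, 100),
}


def speedup(candidate, baseline, data, runs):
    """Return how many times faster `candidate` is than `baseline` on `data`."""
    t_candidate = timeit.timeit(lambda: candidate(data), number=runs)
    t_baseline = timeit.timeit(lambda: baseline(data), number=runs)
    return t_baseline / t_candidate


def loop_sum(xs):
    """Hypothetical baseline: an explicit Python loop."""
    total = 0
    for x in xs:
        total += x
    return total


# Demonstrate on the XS array size only, with far fewer runs to keep it quick.
size, _ = ROUNDS["XS"]
score = speedup(sum, loop_sum, list(range(size)), runs=200)
print(f"{score:.1f}x")
```

The printed score depends on the machine, so no expected value is given here.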
What does the result mean?
~~~~~~~~~~~~~~~~~~~~~~~~~~

The benchmark score is displayed as a table similar to the one below:

======== ===== ==== ==== ==== ==== ==== =======
Item     Dtype XS   S    M    L    XL   Average
======== ===== ==== ==== ==== ==== ==== =======
AddOne   int   0.9x 1.0x 1.0x 1.0x 1.1x 1.0x
ArraySum int   4.8x 6.2x 7.4x 6.4x 7.3x 6.4x
EqualOne int   1.3x 1.3x 1.0x 0.9x 0.8x 1.1x
======== ===== ==== ==== ==== ==== ==== =======

- Item: the task used to compare performance.
- Dtype: the array element type.

Take the third row as an example: when running the ``EqualOne`` task with ``dtype=int``, ``ulist`` is on average 1.1 times as fast as ``numpy``.

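The sample rows are consistent with the ``Average`` column being the arithmetic mean of the five round scores, rounded to one decimal place. The document does not state this, so treat it as an inference. A short check for the ``EqualOne`` row:

```python
# Round scores for EqualOne (dtype=int), copied from the sample table.
rounds = {"XS": 1.3, "S": 1.3, "M": 1.0, "L": 0.9, "XL": 0.8}

# Assumed rule: plain arithmetic mean, shown with one decimal place.
average = sum(rounds.values()) / len(rounds)
label = f"{average:.1f}x"
print(label)  # 1.1x
```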
Benchmark score
~~~~~~~~~~~~~~~

| Info:

----

| Date: 2022-02-26 10:38:31
| System OS: Linux
| CPU: Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
| Python version: 3.10.2
| Ulist version: 0.8.0
| Numpy version: 1.22.0

----

Result:

============ ====== ===== ===== ===== ===== ===== ======= ======
Item         Dtype  XS    S     M     L     XL    Average Faster
============ ====== ===== ===== ===== ===== ===== ======= ======
AddOne       int    0.9x  1.0x  1.0x  1.0x  1.1x  1.0x    N
ArraySum     int    6.0x  7.0x  8.4x  5.5x  7.0x  6.8x    Y
CountElems   int    9.7x  1.7x  0.9x  0.8x  0.9x  2.8x    Y
EqualOne     int    1.4x  1.4x  1.4x  0.9x  0.8x  1.2x    Y
Max          int    4.4x  3.7x  3.2x  3.0x  3.2x  3.5x    Y
MulTwo       int    1.0x  1.0x  0.8x  0.8x  0.8x  0.9x    N
UniqueElem   int    2.7x  0.5x  0.4x  0.3x  0.3x  0.8x    N
Sort         int    0.8x  0.6x  0.9x  0.9x  0.9x  0.8x    N
AddOne       float  1.0x  1.2x  1.2x  1.1x  1.1x  1.1x    Y
ArraySum     float  4.0x  2.0x  0.7x  0.4x  0.4x  1.5x    Y
LessThanOne  float  1.2x  1.2x  0.9x  0.8x  1.0x  1.0x    N
Max          float  2.9x  1.1x  0.2x  0.1x  0.1x  0.9x    N
MulTwo       float  1.0x  1.0x  1.1x  1.0x  1.0x  1.0x    N
Sort         float  0.9x  0.6x  0.7x  0.7x  0.7x  0.7x    N
AllIsTrue    bool   5.5x  3.4x  1.2x  0.7x  0.6x  2.3x    Y
AndOp        bool   0.5x  0.8x  1.3x  4.3x  3.8x  2.1x    Y
AnyIsTrue    bool   5.4x  3.4x  1.2x  0.7x  0.6x  2.3x    Y
NotOp        bool   0.6x  0.9x  1.5x  4.9x  4.5x  2.5x    Y
OrOp         bool   0.5x  0.8x  1.4x  3.6x  3.4x  1.9x    Y
ContainsElem string 16.4x 20.1x 20.6x 20.7x 20.3x 19.6x   Y
CountElems   string 4.7x  1.7x  1.5x  1.9x  2.1x  2.4x    Y
EqualFoo     string 1.2x  2.8x  3.6x  3.9x  2.5x  2.8x    Y
============ ====== ===== ===== ===== ===== ===== ======= ======

14 of 22 tasks are faster!
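The ``Faster`` column appears to be ``Y`` exactly when the displayed ``Average`` exceeds 1.0x. That rule is inferred from the table and is not stated in the document. Under that assumption, the summary line can be reproduced from the ``Average`` column:

```python
# Average scores copied from the Result table, in row order.
averages = [
    1.0, 6.8, 2.8, 1.2, 3.5, 0.9, 0.8, 0.8,   # int
    1.1, 1.5, 1.0, 0.9, 1.0, 0.7,             # float
    2.3, 2.1, 2.3, 2.5, 1.9,                  # bool
    19.6, 2.4, 2.8,                           # string
]

# Assumed rule: a task counts as faster only if its average is above 1.0x.
faster = [a > 1.0 for a in averages]
summary = f"{sum(faster)} of {len(averages)} tasks are faster!"
print(summary)  # 14 of 22 tasks are faster!
```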

develop.md

Lines changed: 0 additions & 21 deletions
This file was deleted.
