A tree, whether a binary search tree (BST), radix tree, red-black tree (RBT), or any other tree imagineable, allowing for multiple reads/writes/deletes functionalities for parallism. Ideally, all reads are lock-free, meaning any read threads should not block. These read threads should always see a consistent version of the tree, and should never block writing threads. As for the writing threads, they should block other writing threads, but not any reading threads. Writing threads blocking writing threads ensures consistency when modifying the tree. Additional info and for BST
Several concurrent algorithms with an elimination: SGL stack and queue, Treiber stack, M&S stack, baskets queue, and elimination SGL/Treiber stacks.
A series of code execution that only commits all if the entire run is completed successfully. Othwerise, it will abort (fail) and roll back. Doing this in parallel seems odd, being that it may be need to wait on one (sequentially) function to run before another. This seems extremely useful for database management. More importantly, there seems that there may be some additional troubleshooting for transactions that may not be discovered until further implemented. Additional read
For the sake of resources available, and class slides, concurrent containers seemed to have the greatest amount of references. As discovered, these are added to the resources section at the end of this document.
Lock-based stack that uses a single global lock, which can decrease performance time when scaling.
Lock-based queue that also uses a single global local, again at a degradation issue with increasing complexity.
The Treiber stack algorithm is a scalable lock-free stack utilizing the fine-grained concurrency primitive compare-and-swap. It is believed that R. Kent Treiber The Treiber stack algorithm is a scalable lock-free stack utilizing the fine-grained concurrency primitive compare-and-swap. It is believed that R. Kent Treiber was the first to publish it in his 1986 article "Systems Programming: Coping with Parallelism". However, treiber stack is prone to bottlenecking and thus offers little scalability.
Non-blocking, dynamic-memory algorithm that whose threads reserve blocks beforehand, so that other threads won't reclaim before it finishes. This comes increased performance overhead.
A concurrent lock-free dynamic linearizable queue algorithm. The throughput of this is greater than the previous ones listed - as it created a mixed order (baskets) of items, instead of the traditional ordered list. Operations within each basket can be performed in parallel.
Concurrent stack algorithm that is lock-free (robust), parallel (scalable), and adaptive (performs well at low loads). Additionally, it is linearizable and combine modularly with other algorithms. Notes from paper: A push followed by a pop does not change a data structure's state. If the two threads can exchange these values without having to touch a centralized structure, they have 'eliminated' the push->pop effect. The idea: use a single elimination array as a backoff scheme on a shared lock-free stack. If the thread fails on the stack, it attempts to eliminate on the array, and if they fail at elimination, they attempt to access the stack again (and so on).
Global arrays:
- location[1..n] has an element per thread, and pointer location to thread
- collision[1..size] holds ids of the threads trying to collide Vars:
- p, q - thread id Order of eliminate stack:
- tries to perform operation on central stack
- if that fails, it goes through the collision layer
- writes thread info to location array
- chooses random location in collision array
- read random location and store it, while trying to write own id
- will keep trying 2c until successful
Wasn't able to find much info on implementing this. FC resources: FC stack, FC Queue FC and trade-off FC chat
Makefile- used to create and remove executable C++11 objectscontainer.cpp- main filecontainer.h- header file for main, includes testingarg_parser.h- parsing and error handling for mysort program input optionssgl_stack.h- functionality for stack using a single global locksgl_queue.h- functionality for queue using a single global locktreiber_stack.h- functionality for treiber stackms_queue.h- functionality for Michael and Scott stackbaskets_queue.h- functionality for Baskets queueelim_stack.h- added functionality for stacks above, using threads to exchange values through eliminationREADME.pdf- write-up for project (this file)
randomNumberGen.cpp- creates a text file with a specified number of random numbers from [ 1 - number size ]. Default number is 10,000 unless specified when ran (as second argument).pthread_add.h- threading if compiling on macOS (will still run as normal for linux systems)'
Note: If zipped, first unzip file before proceeding.
- From root directory, run
makefrom the terminal. This will generate a program called counter and mysort - Next, follow execution syntax below for either counter or mysort.
Make cleanwill remove all files generated bymake.- Additionally,
make numGenwill generate the executable object to generate random numbers to a .txt file (set to source.txt).
> containers [--name]containers --name: prints name to console.
> containers [-h]containers -h: prints help instructions to console.
> containers [--test]containers --test: visual tests using a vector with 15 integers against 1, 3, and 5 threads for each algorithm.
OR
> containers <input_file> [-t NUM THREADS] --algorithm=<sgl_stack,sgl_queue,treiber_stack,ms_queue,baskets_queue, elim_sgl_stack, elim_t_stack> <--test>input_file: name of file with ints-t NUM_THREADS: specify how many threads to use during execution (including master thread)--container=<sgl_stack,sgl_queue,treiber_stack,ms_queue,baskets_queue,elim_sgl_stack,elim_t_stack>: type of algorithm to use
- Additional outputs: time of execution in nanoseconds
Example of program execution, using integers from source.txt file, 5 threads, the sgl_stack algorithm, and display the output
> ./containers source.txt -t 5 --algorithm=sgl_stack --test> perf stat -e L1-dcache-loads -e L1-dcache-load-misses ./containers source.txt -t 5 --algorithm=<sgl_stack,sgl_queue,treiber_stack,ms_queue,baskets_queue, elim_sgl_stack, elim_t_stack>> perf stat -e branch-loads -e branch-load-misses ./containers source.txt -t 5 --algorithm=<sgl_stack,sgl_queue,treiber_stack,ms_queue,baskets_queue, elim_sgl_stack, elim_t_stack>> perf stat -e page-faults ./containers source.txt -t 5 --algorithm=<sgl_stack,sgl_queue,treiber_stack,ms_queue,baskets_queue, elim_sgl_stack, elim_t_stack>> perf stat --repeat 10 -e L1-dcache-loads,L1-dcache-load-misses,branch-loads,branch-load-misses,page-faults ./containers source.txt -t 5 --algorithm=<sgl_stack,sgl_queue,treiber_stack,ms_queue,baskets_queue, elim_sgl_stack, elim_t_stack>| Algorithm | Run Time (s) | L1 cache hit rate | branch-prediction hit rate | page-fault |
|---|---|---|---|---|
| SGL Stack | 0.002508 | 96.94% | 98.77% | 164 |
| SGL Queue | 0.002110 | 96.93% | 98.76% | 164 |
| Treiber Stack | 0.001281 | 97.68% | 99.05% | 232 |
| MS Queue | 0.002370 | 97.46% | 98.97% | 232 |
| Baskets Queue | 0.00 | % | % | |
| Elim SGL Stack | 0.001309 | 97.31% | 98.97% | 272 |
| Elim Treiber Stack | 0.001234 | 97.65% | 99.07% | 272 |
Overall among all of the algorithm's performance tests, the baskets queue algorithms seem to be the fastest, with the treiber stack in second. However, all have very similar hit rates. Ideally, this run times make sense, as the single global locks should perform much slower than the other algorithms since these are sequential operations; each process has to wait for one another before proceeding on. The stacks with elimination (as read, but not implemented) should outperform the SGL and Treiber stacks.
Testing output; output may vary on updates.
================ STARTING TEST CASES ================
---------------- Testing with 1 Thread ----------------
With Nums: { 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 }
β’β’β’β’β’β’ SGL Stack Tests β’β’β’β’β’β’
Pushed: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Popped: 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1
Time elapsed: 0.000117 seconds
β’β’β’β’β’β’ SGL Queue Tests β’β’β’β’β’β’
Enqueued: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Dequeued: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Time elapsed: 0.000060 seconds
β’β’β’β’β’β’ Treiber Stack Tests β’β’β’β’β’β’
Pushed: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Popped: 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1
Time elapsed: 0.000061 seconds
β’β’β’β’β’β’ Michael And Scott Queue Tests β’β’β’β’β’β’
Enqueued: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Dequeued: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Time elapsed: 0.000043 seconds
β’β’β’β’β’β’ Baskets Queue Tests β’β’β’β’β’β’
Enqueued: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Dequeued: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Time elapsed: 0.039010 seconds
β’β’β’β’β’β’ Elimination SGL Stack Tests β’β’β’β’β’β’
Pushed: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Popped: 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1
Time elapsed: 0.000054 seconds
β’β’β’β’β’β’ Elimination Treiber Stack Tests β’β’β’β’β’β’
Pushed: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Popped: 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1
Time elapsed: 0.000066 seconds
---------------- Testing with 3 Threads ----------------
With Nums: { 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 }
β’β’β’β’β’β’ SGL Stack Tests β’β’β’β’β’β’
Pushed: 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
Popped: 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16
Time elapsed: 0.000357 seconds
β’β’β’β’β’β’ SGL Queue Tests β’β’β’β’β’β’
Enqueued: 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
Dequeued: 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
Time elapsed: 0.000340 seconds
β’β’β’β’β’β’ Treiber Stack Tests β’β’β’β’β’β’
Pushed: 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
Popped: 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16
Time elapsed: 0.000117 seconds
β’β’β’β’β’β’ Michael And Scott Queue Tests β’β’β’β’β’β’
Enqueued: 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
Dequeued: 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
Time elapsed: 0.000209 seconds
β’β’β’β’β’β’ Baskets Queue Tests β’β’β’β’β’β’
Enqueued: 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
Dequeued: 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
Time elapsed: 0.000103 seconds
β’β’β’β’β’β’ Elimination SGL Stack Tests β’β’β’β’β’β’
Pushed: 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
Popped: 30 29 28 27 25 24 23 22 26 21 20 19 18 17 16
Time elapsed: 0.000150 seconds
β’β’β’β’β’β’ Elimination Treiber Stack Tests β’β’β’β’β’β’
Pushed: 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
Popped: 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16
Time elapsed: 0.000095 seconds
---------------- Testing with 5 Threads ----------------
With Nums: { 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 }
β’β’β’β’β’β’ SGL Stack Tests β’β’β’β’β’β’
Pushed: 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45
Popped: 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31
Time elapsed: 0.000355 seconds
β’β’β’β’β’β’ SGL Queue Tests β’β’β’β’β’β’
Enqueued: 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45
Dequeued: 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45
Time elapsed: 0.000144 seconds
β’β’β’β’β’β’ Treiber Stack Tests β’β’β’β’β’β’
Pushed: 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45
Popped: 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31
Time elapsed: 0.000138 seconds
β’β’β’β’β’β’ Michael And Scott Queue Tests β’β’β’β’β’β’
Enqueued: 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45
Dequeued: 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45
Time elapsed: 0.000125 seconds
β’β’β’β’β’β’ Baskets Queue Tests β’β’β’β’β’β’
Enqueued: 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45
Dequeued: 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45
Time elapsed: 0.000120 seconds
β’β’β’β’β’β’ Elimination SGL Stack Tests β’β’β’β’β’β’
Pushed: 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45
Popped: 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31
Time elapsed: 0.000131 seconds
β’β’β’β’β’β’ Elimination Treiber Stack Tests β’β’β’β’β’β’
Pushed: 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45
Popped: 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31
Time elapsed: 0.000123 seconds
================ END TEST CASES ================- Input/output files are loaded/saved as is. Does not check for txt file types or if data is integers.
- Creating/Joining of threads is out of order, so sometimes fails pushing/pushing OR enqueuing/dequeuing order.
- *Scalable Lock-free Stack Algorithm
- Flat Combining the Synchronization-Parallelism Tradeoff
- The Baskets Queue
- Concurrent Queue
- A queue among three threads
- Lock-free programming
- Multi-threading
- Stack push & pop
- Queue push & pop
- Wiki Treiber Stack
- Treiber Stack
- Lock-Free Programming
- A Scalable Correct Time-Stamped Stack
- Old counted ptr header file (not currently implemented)
- cpp compare exchange
- stack overflow CAS solution for baskets queue
- nanoseconds sleep for backoff scheme
- Class notes and lectures