gjbex
diff --git a/source_code/presentation/data-formats/.gitignore b/source_code/presentation/data-formats/.gitignore
diff --git a/source_code/presentation/data-formats/Makefile b/source_code/presentation/data-formats/Makefile
diff --git a/source_code/presentation/data-formats/README.md b/source_code/presentation/data-formats/README.md
diff --git a/source_code/presentation/data-formats/Vcd/README.md b/source_code/presentation/data-formats/Vcd/README.md
@@ -0,0 +1 @@
+*.exe
@@ -0,0 +1,13 @@
+CC = gcc
+CFLAGS = -O2 -g -Wall -Wextra
+LDLIBS = -lm
+
+all: write_doubles.exe read_doubles.exe variable_length_arrays.exe \
+     read_bin_records.exe
+
+%.exe: %.o
+	$(CC) $(CFLAGS) -o $@ $< $(LDLIBS)
+
+clean:
+	$(RM) $(wildcard *.o) $(wildcard *.exe)
+	$(RM) core $(wildcard core.*)
@@ -0,0 +1,101 @@
+# Data formats
+Python can handle many data formats "out of the box" using its standard
+library.  Reading and writing CSV and XML are illustrated, as well as
+reading binary data generated by, e.g., C programs.
+
+## What is it?
+
+1. Binary data
+
+  * `write_doubles.c`: C code to write a sequence of `double` to a binary
+    file (square roots of integers).
+  * `read_doubles.c`: C code to read a binary file containing a sequence
+    of `double`, and print those in ASCII representation to verify the
+    contents of the binary file.
+  * `Makefile`: to build the C programs.
+  * `read_doubles.py`: reads sequences of 8 bytes, unpacks them into
+    Python values, and prints them in ASCII to standard output.
+  * `doubles.bin`: binary file.
+  * `variable_length_arrays.c`: C code to write a number of
+    variable-length arrays as binary data.  The length of each array is
+    given as a four-byte unsigned integer, followed by the values as
+    eight-byte little-endian double-precision floating point numbers.
+  * `read_variable_length_array.py`: Python script to read and print
+    variable-length arrays.
+  * `write_bin_records.py`: Python script that writes a record consisting
+    of a fixed-length string and an integer.
+  * `read_bin_records.c`: C application that reads a record consisting
+    of a fixed-length string and an integer.
+
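The layout described for `variable_length_arrays.c` can be read back with the standard library `struct` module. The following is a minimal sketch, not the repository's `read_variable_length_array.py`: the helper names `write_arrays` and `read_arrays` are hypothetical, and the length prefix is assumed to be little-endian like the values, which the description does not state.

```python
import io
import struct

LENGTH_FMT = "<I"  # assumption: four-byte unsigned length, little-endian
LENGTH_SIZE = struct.calcsize(LENGTH_FMT)


def write_arrays(stream, arrays):
    """Write each array as a length prefix followed by its doubles."""
    for values in arrays:
        stream.write(struct.pack(LENGTH_FMT, len(values)))
        stream.write(struct.pack(f"<{len(values)}d", *values))


def read_arrays(stream):
    """Yield one list of floats per array until the stream is exhausted."""
    while True:
        header = stream.read(LENGTH_SIZE)
        if not header:
            break  # clean end of file
        if len(header) != LENGTH_SIZE:
            raise ValueError("truncated length prefix")
        (count,) = struct.unpack(LENGTH_FMT, header)
        payload = stream.read(8 * count)
        if len(payload) != 8 * count:
            raise ValueError("truncated array data")
        yield list(struct.unpack(f"<{count}d", payload))


# Round trip through an in-memory buffer instead of a real file.
buffer = io.BytesIO()
write_arrays(buffer, [[1.0, 2.5], [], [3.0]])
buffer.seek(0)
arrays = list(read_arrays(buffer))
print(arrays)
```

Stating the byte order explicitly (`<`) keeps the result independent of the machine the script runs on; with the default native format, `struct` would also apply platform-specific alignment.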
+1. CSV files
+
+  * `write_csv.py`: uses the standard library `csv` module to create
+    a CSV file with four columns and five rows.
+  * `read_csv.py`: reads a CSV file (e.g., `data.csv`) that has two
+    columns, `name` and `weight`, and prints the values to standard
+    output.  It uses the `csv.Sniffer` class to detect the CSV dialect.
+  * `data.csv`: example file to use with `read_csv.py`.
+  * `read_commented_csv.py`: illustrates how to handle files that are
+    not strictly CSV because they start with a comment header.
+  * `data_commented_tabs.csv`: tab-separated CSV file.
+  * `data_commented_commas.csv`: comma-separated CSV file.
+  * `data_commented_semicolon.csv`: semicolon-separated CSV file.
+  * `read_csv_rows.py`: illustrates the default CSV reader.
+  * `agt_parser.py`: parser for a file format that is partially CSV.
+  * `agt_data`: data files to parse using `agt_parser.py`.
+
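Dialect detection as described for `read_csv.py` might look like the sketch below. It is an illustration under assumptions rather than the script itself: the in-memory sample stands in for a file such as `data.csv`, and the semicolon delimiter and the row values are made up.

```python
import csv
import io

# Stand-in for a file such as data.csv; the semicolon delimiter is arbitrary.
text = "name;weight\nalice;61.5\nbob;78.2\n"

with io.StringIO(text) as csv_file:
    # Let the sniffer inspect a sample, restricted to likely delimiters.
    dialect = csv.Sniffer().sniff(csv_file.read(1024), delimiters=";,\t")
    csv_file.seek(0)  # rewind so the reader starts at the header row
    reader = csv.DictReader(csv_file, dialect=dialect)
    rows = [(row["name"], float(row["weight"])) for row in reader]

for name, weight in rows:
    print(f"{name}: {weight}")
```

Restricting the candidate delimiters makes the sniffer less likely to pick a character that merely happens to occur equally often on every line.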
+1. JSON files
+
+  * `average_age.py`: computes the average age of "people" stored in a JSON
+    file.
+  * `average_age_functional.py`: computes the average age of "people"
+    stored in a JSON file in functional style.
+  * `people.json`: JSON file containing personal information.
+
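Computing an average over JSON records needs only the standard library `json` module. A minimal sketch in both styles follows; the layout of `people.json` is not shown here, so the list of objects with an `age` key is an assumption, as are all names and values in the code.

```python
import json
from statistics import mean

# Assumed layout: a JSON array of objects, each with a numeric "age".
text = '[{"name": "alice", "age": 30}, {"name": "bob", "age": 45}]'
people = json.loads(text)

# Imperative style: accumulate a running total.
total = 0
for person in people:
    total += person["age"]
average_loop = total / len(people)

# Functional style: map each record to its age, then reduce with mean().
average_functional = mean(map(lambda person: person["age"], people))

print(average_loop, average_functional)
```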
+1. XML files
+
+  * `write_xml.py`: creates XML that has a root-level `blocks` element,
+    containing `block` elements that are named (by attribute), and can
+    have `item` elements, where the latter contain text.
+    Use the `-h` option to see how to specify parameters.
+    XML is generated using the `xml.dom.minidom` module in the standard
+    library.
+  * `read_xml.py`: reads an XML file in the format described above,
+    and writes each item, preceded by its block's name, to standard
+    output.  This program can also deal with nested blocks, i.e.,
+    an XML file where a `block` element can contain another `block`
+    element.
+    The SAX parser in the `xml.sax` module is used for parsing the XML.
+    A `ContentHandler` is implemented, and a `Block` class is used for
+    data representation.
+  * `blocks.xml`: example XML file.
+  * `nested_blocks.xml`: example XML file containing nested block elements.
+
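The SAX approach described for `read_xml.py` can be sketched as follows. This is a minimal illustration, not the repository's code: the handler class `BlockHandler` is hypothetical, the attribute that names a block is assumed to be called `name`, and a plain stack of names stands in for the `Block` class mentioned in the description.

```python
import xml.sax


class BlockHandler(xml.sax.ContentHandler):
    """Collect (block name, item text) pairs, including nested blocks."""

    def __init__(self):
        super().__init__()
        self.block_names = []  # stack: innermost open block is last
        self.items = []
        self._text = None      # list of text chunks while inside an item

    def startElement(self, name, attrs):
        if name == "block":
            # Assumption: the naming attribute is called "name".
            self.block_names.append(attrs.get("name", ""))
        elif name == "item":
            self._text = []

    def characters(self, content):
        # SAX may deliver the text of one element in several chunks.
        if self._text is not None:
            self._text.append(content)

    def endElement(self, name):
        if name == "item":
            text = "".join(self._text).strip()
            self.items.append((self.block_names[-1], text))
            self._text = None
        elif name == "block":
            self.block_names.pop()


document = b"""<blocks>
  <block name="outer">
    <item>a</item>
    <block name="inner"><item>b</item></block>
    <item>c</item>
  </block>
</blocks>"""

handler = BlockHandler()
xml.sax.parseString(document, handler)
for block_name, text in handler.items:
    print(f"{block_name}: {text}")
```

Because the handler keeps a stack, an item that follows a nested block is still attributed to the enclosing block once the inner one has closed.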
+1. Text as binary
+
+  * `line_indexer.py`: indexes a text file, i.e., it produces a CSV file
+    with two columns: the first is the file position of the start of each
+    line in the text file, the second the length of that line, excluding
+    the line ending.  The text file is read in binary mode.
+  * `index.txt`: example file to index.
+  * `read_line_index.py`: test script that takes a text file, a file
+    position, and a line length as input, and prints the characters read
+    to standard output for verification, delimited by `|`.
+
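The indexing idea can be sketched in a few lines. The functions `index_lines` and `read_line` are hypothetical names, and an in-memory byte stream stands in for a text file opened in binary mode; the actual scripts may differ.

```python
import io


def index_lines(stream):
    """Return (position, length) per line; length excludes the line ending."""
    index = []
    position = 0
    for line in stream:  # binary streams yield bytes, split after b"\n"
        index.append((position, len(line.rstrip(b"\r\n"))))
        position += len(line)
    return index


def read_line(stream, position, length):
    """Fetch one line by seeking straight to its recorded position."""
    stream.seek(position)
    return stream.read(length)


data = io.BytesIO(b"alpha\nbe\r\n\ngamma")
index = index_lines(data)
print(index)
for position, length in index:
    print("|" + read_line(data, position, length).decode("ascii") + "|")
```

Reading in binary mode matters here: positions are byte offsets, so they stay valid for `seek()` regardless of line-ending translation or multi-byte characters.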
+1. CSV, HDF5 and JSON combination
+
+  For more (and better) examples of reading and writing, see
+  the [DataStorage](https://github.com/gjbex/training-material/tree/master/DataStorage/Hdf5) section.
+  * `data_generator.py`: this script will generate numerical data (integer
+    and floating point) using specified random number distributions.  The
+    column names, types, distributions, and the parameters they require
+    are specified in a JSON file that is read.  The generated data is
+    written to a file in either CSV or HDF5 format.
+  * `mixed_data.json`: example JSON definition file for the data table to
+    be generated.
+
+1. `VCD` files
+
+## Other formats
+
+Example code for using NetCDF can be found in the [DataStorage](https://github.com/gjbex/training-material/tree/master/DataStorage/NetCDF) section.
@@ -0,0 +1,7 @@
+# Vcd
+Parser for a particular subset of Verilog VCD (Value Change Dump) files.
+
+## What is it?
+1. `vcd_parser.py`: Python script/module that parses a VCD file, and
+    creates a data structure to handle the data.
+1. `TST_GPIO_1.vcd`, `TST_UC1.vcd`: sample VCD files