Commit e6796ef

Merge branch 'develop' into refine-H-mw-api
2 parents 833dcfe + e2bf283 commit e6796ef

11 files changed

Lines changed: 742 additions & 30 deletions

File tree

docs/examples.rst

Lines changed: 8 additions & 0 deletions
@@ -66,3 +66,11 @@ Directory Description
6666
``rsqmc_misc/graphene`` Graphene sheet DMC example including use of Nexus analyzer to obtain total energy.
6767
``rsqmc_misc/c20`` C\ :math:`_{20}` fullerene molecule using pseudopotentials and spline orbitals from Quantum ESPRESSO.
6868
======================================================= =================================================================================================================================================================
69+
70+
Beyond orchestrating QMCPACK calculations, Nexus can launch arbitrary Python or shell scripts, so custom preparation and analysis steps remain integrated in the workflow. For example, this feature can be used to post-process QMCPACK charge densities via command-line executables such as ``qdens``.
71+
72+
Directory Description
73+
======================================================= =================================================================================================================================================================
74+
``nexus/examples/generic/python_demo`` Generates an example data file and processes the dependent data file as Nexus simulation objects.
75+
``nexus/examples/generic/bash_demo`` Simple bash command that lists the contents of a directory as a Nexus simulation object.
76+
======================================================= =================================================================================================================================================================

nexus/docs/examples.rst

Lines changed: 308 additions & 0 deletions
@@ -1705,3 +1705,311 @@ k-point grid density in one dimension.
17051705

17061706
run_project(scf,nscf,conv,qmc)
17071707

1708+
.. _custom-simulations:
1709+
1710+
1711+
Example 7: GenericSimulation Examples
1712+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1713+
1714+
The files for this example are found in:
1715+
1716+
.. code:: rest
1717+
1718+
/your_download_path/nexus/examples/generic
1719+
1720+
The ``GenericSimulation`` class provides support for executing custom scripts with Nexus.
1721+
It uses template files for input generation and also provides output file tracking to determine simulation completion status.
1722+
This makes it ideal for running Python scripts, bash commands, and any other executable within the Nexus framework.
1723+
GenericSimulation can execute both serial and parallel jobs, making it suitable for a wide range of computational workflows.
1724+
1725+
Key Features
1726+
~~~~~~~~~~~~
1727+
1728+
**Automatic Template Processing**: GenericSimulation automatically handles
1729+
input templates for both file paths and text content, making it easy to
1730+
pass parameters between simulations. Templates support variable substitution
1731+
using Python's ``string.Template`` syntax.
1732+
1733+
**Output File Tracking**: The system tracks specified output files to determine
1734+
simulation completion status, eliminating the need for manual completion files.
1735+
1736+
**Flexible Job Configuration**: Supports both serial and parallel execution
1737+
through standard job configuration with automatic command generation.
1738+
1739+
**Template Substitution**: Automatic replacement of template variables like
1740+
``${output}`` with actual dependency paths.
1741+
1742+
**Extensibility**: Any executable or script can be integrated into Nexus
1743+
workflows using a template and dependency management system.
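The substitution described in these features follows the rules of the standard library's ``string.Template``. A minimal sketch using only the standard library (it does not call Nexus; the script text and the path are made-up placeholders):

```python
from string import Template

# Hypothetical one-line script containing a single template variable.
script_text = "ls -alth ${output}\n"

# substitute() replaces ${output} with the value supplied by keyword.
rendered = Template(script_text).substitute(output="/path/to/data_generator")
print(rendered)
```

The same ``${name}`` syntax applies whether the template text comes from a file or from an inline string.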
1744+
1745+
Available Examples
1746+
~~~~~~~~~~~~~~~~~~
1747+
1748+
The generic examples directory contains several demonstration scripts:
1749+
1750+
**bash_demo.py**: Demonstrates running a bash script through ``job(app='bash')``
1751+
- Shows how to run bash commands
1752+
- Uses template substitution for dependency handling
1753+
- Simple and easy to understand
1754+
1755+
**python_demo.py**: Demonstrates pure Python data processing
1756+
- Uses numpy for data generation and analysis
1757+
- Shows template-based dependency management
1758+
1759+
Basic Usage
1760+
~~~~~~~~~~~
1761+
1762+
The simplest way to use GenericSimulation is to create a simulation with
1763+
a script file. Here's a simplified version of the actual examples:
1764+
1765+
**Python Data Generation Example** (from ``python_demo.py``):
1766+
1767+
.. code:: python
1768+
1769+
from nexus import generate_simulation, job, run_project
1770+
1771+
# Create a data generator simulation
1772+
data_generator = generate_simulation(
1773+
identifier='data_generator',
1774+
path='data_generator',
1775+
job=job(serial=True, app='python3'),
1776+
input='scripts/data_generator.py', # Script file path
1777+
outfiles=['data_generation_complete.txt'] # Expected output files
1778+
)
1779+
1780+
# Run the simulation
1781+
run_project(data_generator)
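The idea behind ``outfiles`` can be pictured as a check that every expected file exists in the run directory. The sketch below only illustrates that idea and makes no claim about how Nexus implements it; ``outputs_complete`` is a hypothetical helper, and a temporary directory stands in for the simulation directory.

```python
import os
import tempfile


def outputs_complete(run_dir, outfiles):
    """Return True when every expected output file exists under run_dir."""
    return all(os.path.isfile(os.path.join(run_dir, name)) for name in outfiles)


with tempfile.TemporaryDirectory() as run_dir:
    expected = ["data_generation_complete.txt"]

    # Nothing has been written yet, so the check fails.
    before = outputs_complete(run_dir, expected)

    # Mimic the script writing its completion file.
    with open(os.path.join(run_dir, expected[0]), "w") as f:
        f.write("Data generation completed successfully\n")

    after = outputs_complete(run_dir, expected)

print(before, after)
```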
1782+
1783+
**Job Configuration Options**:
1784+
1785+
GenericSimulation supports both serial and parallel execution.
1786+
Assuming a suitable script is passed to the ``generate_simulation`` function, the following options are available:
1787+
1788+
.. code:: python
1789+
1790+
# Serial execution
1791+
job(serial=True, app='python3') # python3 my_script.py
1792+
1793+
# Parallel execution (uses machine settings)
1794+
job(app='python3', cores=4) # mpirun -np 4 python3 my_script.py
1795+
1796+
# Custom executable
1797+
job(serial=True, app='/path/to/custom/executable') # /path/to/custom/executable my_script
1798+
1799+
# Bash script
1800+
job(serial=True, app='bash') # bash my_script.sh
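The comments in the block above pair each ``job`` with the command line it is expected to produce. That mapping can be written down as a small function. This is purely illustrative: ``build_command`` and its ``launcher`` argument are hypothetical, and in Nexus the actual launcher and its flags come from the machine settings.

```python
def build_command(app, script, cores=None, launcher="mpirun"):
    """Illustrative only: reproduce the command lines shown in the comments."""
    if cores is None:
        # Serial case: the application is called directly on the script.
        return f"{app} {script}"
    # Parallel case: the launcher wraps the application call.
    return f"{launcher} -np {cores} {app} {script}"


print(build_command("python3", "my_script.py"))
print(build_command("python3", "my_script.py", cores=4))
print(build_command("bash", "my_script.sh"))
```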
1801+
1802+
**Bash Command Example** (from ``bash_demo.py``):
1803+
1804+
.. code:: python
1805+
1806+
from nexus import generate_simulation, input_template, job, run_project
1807+
1808+
# Create bash command simulation with template
1809+
list_template = input_template(filepath='scripts/list_directory.sh')
1810+
list_template.assign(output='/path/to/data/directory')
1811+
1812+
bash_executor = generate_simulation(
1813+
identifier='bash_executor',
1814+
path='bash_executor',
1815+
job=job(serial=True, app='bash'),
1816+
input=list_template,
1817+
outfiles=['list_directory_complete.txt']
1818+
)
1819+
1820+
run_project(bash_executor)
1821+
1822+
**Input Types**:
1823+
1824+
GenericSimulation accepts different types of input:
1825+
1826+
.. code:: python
1827+
1828+
# 1. Script file path (recommended)
1829+
input='scripts/data_generator.py'
1830+
1831+
# 2. Text content directly
1832+
input='''#!/usr/bin/env python3
1833+
print("Hello!")
1834+
'''
1835+
1836+
# 3. Template with variables and file path
1837+
file_template = input_template(filepath='scripts/list_directory.sh')
1838+
file_template.assign(output='/path/to/data')
1839+
input=file_template
1840+
1841+
# 4. Template with variables and text content
1842+
text_template = input_template(text='''#!/usr/bin/env python3
1843+
print("Hello! ${variable1} ${variable2}")
1844+
''')
1845+
text_template.assign(variable1='value1', variable2='value2')
1846+
input=text_template
1847+
1848+
Template-Based Dependencies
1849+
~~~~~~~~~~~~~~~~~~~~~~~~~~~
1850+
1851+
GenericSimulation supports template substitution for dependencies. A simulation
1852+
can access data produced by another simulation by having a template variable
1853+
replaced with that simulation's path.
1854+
1855+
**What input_template does**:
1856+
1857+
The ``input_template`` function creates a template object that can substitute
1858+
variables in script files. It automatically replaces template variables with
1859+
actual values at runtime.
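The file-based case can be pictured with the standard library alone: read the template text, substitute the variables, and write the filled script. This sketch does not use Nexus; the file names and the path value are placeholders, and the last part shows what ``string.Template.substitute`` does when a variable is left unassigned.

```python
import tempfile
from pathlib import Path
from string import Template

with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)

    # Hypothetical template script containing one variable.
    template_file = tmp / "data_processor.py"
    template_file.write_text("data_dir = '${output}'\n")

    # Read the template and fill in the variable.
    filled = Template(template_file.read_text()).substitute(
        output="/abs/path/to/data_generator"
    )

    # Write the filled script where it would be executed.
    run_file = tmp / "data_processor.filled.py"
    run_file.write_text(filled)
    written = run_file.read_text()

# substitute() raises KeyError when a variable has no assigned value.
try:
    Template("data_dir = '${output}'\n").substitute()
    missing_detected = False
except KeyError:
    missing_detected = True

print(written, missing_detected)
```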
1860+
1861+
**Complete Workflow Example** (simplified from ``python_demo.py``):
1862+
1863+
.. code:: python
1864+
1865+
from nexus import generate_simulation, input_template, job, run_project
1866+
import os
1867+
1868+
# First simulation - generates data
1869+
data_generator = generate_simulation(
1870+
identifier='data_generator',
1871+
path='data_generator',
1872+
job=job(serial=True, app='python3'),
1873+
input='scripts/data_generator.py',
1874+
outfiles=['data_generation_complete.txt']
1875+
)
1876+
1877+
# Second simulation - processes data from first simulation
1878+
processor_template = input_template(filepath='scripts/data_processor.py')
1879+
processor_template.assign(output=os.path.abspath(data_generator.locdir))
1880+
1881+
data_processor = generate_simulation(
1882+
identifier='data_processor',
1883+
path='data_processor',
1884+
job=job(serial=True, app='python3'),
1885+
input=processor_template,
1886+
outfiles=['data_processing_complete.txt'],
1887+
dependencies=[(data_generator, 'other')] # data_generator must complete first
1888+
)
1889+
1890+
run_project()
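The ``dependencies`` argument fixes the execution order: ``data_generator`` must finish before ``data_processor`` starts. The ordering idea can be sketched with the standard library's ``graphlib`` (Python 3.9 or later). This is only an illustration of dependency ordering, not a description of the Nexus scheduler, and the dictionary is a hand-written stand-in for the simulation objects.

```python
from graphlib import TopologicalSorter

# Map each simulation identifier to the identifiers it depends on.
depends_on = {
    "data_generator": set(),
    "data_processor": {"data_generator"},
}

# static_order() yields every node after all of its dependencies.
run_order = list(TopologicalSorter(depends_on).static_order())
print(run_order)
```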
1891+
1892+
**Template Variable Usage in Scripts**:
1893+
1894+
In your script files, use template variables that will be replaced at runtime:
1895+
1896+
**Python Script Example** (simplified from ``data_processor.py``):
1897+
1898+
.. code:: python
1899+
1900+
# Template variable will be replaced with actual path
1901+
data_dir = '${output}' # Replaced with data_generator's output directory path
1902+
1903+
print(f"Processing data from: {data_dir}")
1904+
1905+
# Use the variable in your code
1906+
with open(f"{data_dir}/data/matrix.txt", 'r') as f:
1907+
    data = f.read()
1908+
1909+
**Bash Script Example** (simplified from ``list_directory.sh``):
1910+
1911+
.. code:: bash
1912+
1913+
# Template variable will be replaced with actual path
1914+
ls -alth ${output} # Lists contents of the dependency directory
1915+
ls $$HOME # $$ is an escaped $, so bash receives the literal $HOME
1916+
1917+
1918+
Script Templates
1919+
~~~~~~~~~~~~~~~~
1920+
1921+
GenericSimulation scripts should follow a simple template. The choice between
1922+
Python and Bash depends on your specific needs:
1923+
1924+
**Python Script Template** (simplified from ``data_generator.py``):
1925+
1926+
.. code:: python
1927+
1928+
#!/usr/bin/env python3
1929+
import sys
1930+
import os
1931+
1932+
try:
1933+
    # Your code here
1934+
    print("=== Data Generator Simulation ===")
1935+
1936+
    # Process data, run calculations, etc.
1937+
    # Template variables are automatically replaced
1938+
    data_path = '${output}' # If using templates
1939+
    print(f"Processing data from: {data_path}")
1940+
1941+
    # Create output files
1942+
    with open('data_generation_complete.txt', 'w') as f:
1943+
        f.write("Data generation completed successfully\n")
1944+
1945+
    print("Simulation completed successfully!")
1946+
1947+
except Exception as e:
1948+
    # Error handling
1949+
    print(f"Simulation failed: {e}")
1950+
    sys.exit(1)
1951+
1952+
**Bash Script Template** (from ``list_directory.sh``):
1953+
1954+
.. code:: bash
1955+
1956+
#!/bin/bash
1957+
# Template variables are automatically replaced
1958+
ls -alth ${output} # Lists contents of dependency directory
1959+
ls $$HOME # Escaped $ for literal use
1960+
1961+
# Create output files
1962+
echo "List directory completed" > ./list_directory_complete.txt
1963+
1964+
1965+
**Important Note for Bash Scripts**: The template system uses Python's ``string.Template``
1966+
class with ``$`` as the default delimiter. This can conflict with bash variable syntax
1967+
where ``$`` is used for variable substitution. When writing bash scripts for use with
1968+
GenericSimulation, be careful using ``$``. For example:
1969+
1970+
- Use ``${output}`` for template variables (will be replaced)
1971+
- Avoid bare ``$HOME``, ``$PATH``, etc. in bash scripts; write ``$$`` wherever a literal ``$`` is needed (e.g. ``$$HOME``)
1972+
1973+
This limitation exists because templating is based on the standard Python ``string.Template``
1974+
class, and unless there is significant demand for a more flexible templating method,
1975+
this approach will be maintained.
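The escaping rule can be checked directly with ``string.Template``. The sketch below uses only the standard library. The source does not state whether Nexus calls ``substitute`` or ``safe_substitute`` internally, so both behaviors are shown for an unescaped ``$HOME``.

```python
from string import Template

# ${output} is a template variable; $$HOME is an escaped literal $HOME.
script = "ls -alth ${output}\nls $$HOME\n"
rendered = Template(script).substitute(output="/path/to/data")
print(rendered)

# An unescaped $HOME is treated as a template variable named HOME.
unescaped = Template("ls $HOME\n")

# substitute() raises KeyError because HOME has no assigned value.
try:
    unescaped.substitute(output="/path/to/data")
    strict_failed = False
except KeyError:
    strict_failed = True

# safe_substitute() leaves the unknown placeholder untouched instead.
lenient = unescaped.safe_substitute(output="/path/to/data")
print(strict_failed, lenient)
```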
1976+
1977+
**Extensibility and Custom Executables**:
1978+
1979+
GenericSimulation can run any executable, not just Python and Bash scripts:
1980+
1981+
.. code:: python
1982+
1983+
# Custom executable
1984+
custom_sim = generate_simulation(
1985+
identifier='custom_tool',
1986+
path='./custom_tool',
1987+
job=job(serial=True, app='/path/to/your/executable'),
1988+
input='input_file.txt',
1989+
outfiles=['output.txt']
1990+
)
1991+
1992+
# MATLAB script
1993+
matlab_sim = generate_simulation(
1994+
identifier='matlab_calc',
1995+
path='./matlab_calc',
1996+
job=job(serial=True, app='matlab'),
1997+
input="run_script.m",
1998+
outfiles=['results.mat']
1999+
)
2000+
2001+
# R script
2002+
r_sim = generate_simulation(
2003+
identifier='r_analysis',
2004+
path='./r_analysis',
2005+
job=job(serial=True, app='Rscript'),
2006+
input='analysis.R',
2007+
outfiles=['plots.pdf', 'data.csv']
2008+
)
2009+
2010+
This makes GenericSimulation extremely flexible for integrating any computational
2011+
tool into Nexus workflows, as long as the tool can read input files and produce
2012+
output files that can be tracked for completion status. For templating input files,
2013+
the ``input_template`` function is based on the standard Python ``string.Template`` class
2014+
and therefore uses ``$`` as the default delimiter. Users who need a delimiter
2015+
better suited to their code's syntax can develop their own template class.
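One way to obtain a different delimiter is to subclass ``string.Template`` and override its ``delimiter`` attribute, which the standard library supports. The sketch below shows only that mechanism. ``AtTemplate`` is a hypothetical name, and how such a class would be wired into a Nexus input object is not covered here.

```python
from string import Template


class AtTemplate(Template):
    """Template variant that uses @ instead of $ as the delimiter."""
    delimiter = "@"


# With @ as the delimiter, bash variables such as $HOME need no escaping.
script = "ls -alth @{output}\nls $HOME\n"
rendered = AtTemplate(script).substitute(output="/path/to/data")
print(rendered)
```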
Lines changed: 63 additions & 0 deletions
@@ -0,0 +1,63 @@
1+
#!/usr/bin/env python3
2+
3+
"""
4+
Bash Demo: Demonstration of GenericSimulation class with bash commands.
5+
This shows how to use GenericSimulation to run a bash script that depends on a Python simulation.
6+
7+
Features demonstrated:
8+
- Python data generation
9+
- Bash script executed through job(app='bash')
10+
- Template-based dependency handling
11+
- Output file tracking for completion status
12+
"""
13+
14+
import os
15+
from nexus import settings, job, run_project
16+
from nexus import generate_simulation, input_template
17+
18+
# Configure Nexus settings
19+
settings(
20+
runs = 'runs/generic_bash',
21+
results = '',
22+
sleep = 1,
23+
machine = 'ws16',
24+
)
25+
26+
print("Bash Demo: GenericSimulation with bash commands...")
27+
print("This shows: Python data generation -> bash script listing the generated data directory")
28+
print()
29+
30+
# First simulation: Data generator (Python) using GenericSimulation
31+
# Read the script from file
32+
script_path = os.path.join('scripts', 'data_generator.py')
33+
34+
# Create data generator using GenericSimulation
35+
dg_outfiles = ["data/"+f+".txt" for f in 'matrix statistics x_values y_values'.split()] + ["data_generation_complete.txt"]
36+
# data_generation_complete.txt file is produced by data_generator.py script
37+
# Tracking a completion file is optional. If none is provided,
38+
# the simulation is considered finished once the script execution completes.
39+
data_generator = generate_simulation(
40+
identifier = 'data_generator',
41+
path = 'data_generator',
42+
job = job(serial=True, app='python3'),
43+
input = script_path, # Pass script file path
44+
outfiles = dg_outfiles, # Specify completion files
45+
)
46+
47+
# Second simulation: Bash command executor with python dependency
48+
lister_script_path = os.path.join('scripts', 'list_directory.sh')
49+
50+
input_dl = input_template(filepath=lister_script_path)
51+
input_dl.assign(output=os.path.abspath(data_generator.locdir))
52+
53+
bash_executor = generate_simulation(
54+
identifier = 'bash_executor',
55+
path = 'bash_executor',
56+
job = job(serial=True, app='bash'),
57+
input = input_dl, # Pass the filled template
58+
outfiles = ['list_directory_complete.txt'], # Specify completion file
59+
dependencies = [(data_generator, 'other')],
60+
)
61+
62+
# Run the project
63+
run_project()
