Merged
26 commits
101b11e
Refactor filtering to follow DRY
php1ic Mar 29, 2026
292c4f0
Use multiple inheritance rather than a chain
php1ic Mar 29, 2026
e81454c
Tidy up the NUBASE column labelling
php1ic Mar 29, 2026
9547733
AME mass file inheritance refactor
php1ic Mar 29, 2026
132073d
AME reaction file 1 inheritance refactor
php1ic Mar 29, 2026
5f98806
AME reaction file 2 inheritance refactor
php1ic Mar 29, 2026
c229570
Remove redundant year checks
php1ic Mar 29, 2026
f182c82
Use the function rather than raw dictionary access
php1ic Mar 29, 2026
181b922
As with c229570 remove redundant checks
php1ic Mar 29, 2026
ff2cf90
Refactor NUBASE parsing into a dedicated class
php1ic Mar 29, 2026
a48f11a
Remove isort config in favour of ruff
php1ic Apr 1, 2026
3cfc7de
Update ruff sorting config
php1ic Apr 1, 2026
70fd5e5
Generalise read_fwf for multiple input formats
php1ic Apr 1, 2026
0356c0d
Add tests against the relative error calculations
php1ic Apr 1, 2026
602b43a
Top level classes to deal with all AME and NUBASE data
php1ic Apr 1, 2026
31f5b6a
Update the MassTable class to make use of the new structure
php1ic Apr 1, 2026
5e8bb70
Add tests for the new top level classes
php1ic Apr 1, 2026
40abc4d
Add some test to the MassTable after the refactor
php1ic Apr 4, 2026
bdb16e6
Add more test coverage for MassTable
php1ic Apr 4, 2026
2ea5d98
Correct index testing
php1ic Apr 5, 2026
f0b3547
Update unit checking with more non-time units
php1ic Apr 5, 2026
a2d317e
Remove isort check from linting command
php1ic Apr 5, 2026
f1eb793
Allow a MassTable instance to be subscriptable
php1ic Apr 5, 2026
39e763a
Revert all functionality based around top level dataframe access
php1ic Apr 5, 2026
ad48107
Update usage examples in the README
php1ic Apr 5, 2026
1658d02
Rename file so it matches the class within
php1ic Apr 5, 2026
114 changes: 56 additions & 58 deletions README.md
@@ -54,84 +54,82 @@ git clone https://github.com/php1ic/nuclearmasses
> While every effort is made to maintain a stable API, this module is relatively new so users should not be surprised if there are changes between versions.
> If a breaking change has been introduced, it will always be highlighted in the [CHANGELOG](CHANGELOG.md).

-Once installed or cloned, the data is available as a single dataframe indexed on the mass table year
+The combination of AME and NUBASE values from all years is available as a single dataframe
```python
>>> from nuclearmasses.mass_table import MassTable
->>> df = MassTable().full_data
+>>> df = MassTable().data
```
You can then interrogate, or extract, whatever information you want.
For example, how has the mass excess and it's accuracy changed overtime for 190Re according to the AME
```python
>>> df[(df['A'] == 190) & (df['Symbol'] == 'Re')][['AMEMassExcess', 'AMEMassExcessError']]
-AMEMassExcess AMEMassExcessError
-TableYear
-1983 -35536.605 200.029
-1993 -35557.789 145.549
-1995 -35568.032 212.151
-2003 -35566.326 149.248
-2012 -35634.992 70.542
-2016 -35635.830 70.852
-2020 -35583.015 4.870
+AMEMassExcess AMEMassExcessError
+16054 -35536.605 200.029
+16055 -35557.789 145.549
+16056 -35568.032 212.151
+16057 -35566.326 149.248
+16058 -35634.992 70.542
+16059 -35635.830 70.852
+16060 -35583.015 4.870
```
Or how does the mass excess of gold vary across the isotopic chain according to NUBASE in the most recent table for both experimentally measured and theoretical values
```python
>>> df.query("TableYear == 2020 and Symbol == 'Au'")[['A', 'NUBASEMassExcess', 'NUBASEMassExcessError', 'Experimental']]
-A NUBASEMassExcess NUBASEMassExcessError Experimental
-TableYear
-2020 168 2530.0 400.0 False
-2020 169 -1790.0 300.0 False
-2020 170 -3700.0 200.0 False
-2020 171 -7562.0 21.0 True
-2020 172 -9320.0 60.0 True
-2020 173 -12832.0 23.0 True
-2020 174 -14060.0 100.0 False
-2020 175 -17400.0 40.0 True
-2020 176 -18520.0 30.0 True
-2020 177 -21546.0 10.0 True
-2020 178 -22303.0 10.0 True
-2020 179 -24989.0 12.0 True
-2020 180 -25626.0 5.0 True
-2020 181 -27871.0 20.0 True
-2020 182 -28304.0 19.0 True
-2020 183 -30191.0 9.0 True
-2020 184 -30319.0 22.0 True
-2020 185 -31858.1 2.6 True
-2020 186 -31715.0 21.0 True
-2020 187 -33029.0 22.0 True
-2020 188 -32371.3 2.7 True
-2020 189 -33582.0 20.0 True
-2020 190 -32834.0 3.0 True
-2020 191 -33798.0 5.0 True
-2020 192 -32772.0 16.0 True
-2020 193 -33405.0 9.0 True
-2020 194 -32211.9 2.1 True
-2020 195 -32567.1 1.1 True
-2020 196 -31138.7 3.0 True
-2020 197 -31139.8 0.5 True
-2020 198 -29580.8 0.5 True
-2020 199 -29093.8 0.5 True
-2020 200 -27240.0 27.0 True
-2020 201 -26401.0 3.0 True
-2020 202 -24353.0 23.0 True
-2020 203 -23143.0 3.0 True
-2020 204 -20390.0 200.0 False
-2020 205 -18570.0 200.0 False
-2020 206 -14190.0 300.0 False
-2020 207 -10640.0 300.0 False
-2020 208 -5910.0 300.0 False
-2020 209 -2230.0 400.0 False
-2020 210 2680.0 400.0 False
+A NUBASEMassExcess NUBASEMassExcessError Experimental
+14084 168 2530.0 400.0 False
+14189 169 -1790.0 300.0 False
+14291 170 -3700.0 200.0 False
+14391 171 -7562.0 21.0 True
+14492 172 -9320.0 60.0 True
+14591 173 -12832.0 23.0 True
+14687 174 -14060.0 100.0 False
+14781 175 -17400.0 40.0 True
+14874 176 -18520.0 30.0 True
+14968 177 -21546.0 10.0 True
+15060 178 -22303.0 10.0 True
+15153 179 -24989.0 12.0 True
+15244 180 -25626.0 5.0 True
+15334 181 -27871.0 20.0 True
+15419 182 -28304.0 19.0 True
+15503 183 -30191.0 9.0 True
+15588 184 -30319.0 22.0 True
+15673 185 -31858.1 2.6 True
+15757 186 -31715.0 21.0 True
+15842 187 -33029.0 22.0 True
+15926 188 -32371.3 2.7 True
+16007 189 -33582.0 20.0 True
+16088 190 -32834.0 3.0 True
+16164 191 -33798.0 5.0 True
+16243 192 -32772.0 16.0 True
+16320 193 -33405.0 9.0 True
+16401 194 -32211.9 2.1 True
+16480 195 -32567.1 1.1 True
+16560 196 -31138.7 3.0 True
+16637 197 -31139.8 0.5 True
+16713 198 -29580.8 0.5 True
+16788 199 -29093.8 0.5 True
+16861 200 -27240.0 27.0 True
+16935 201 -26401.0 3.0 True
+17012 202 -24353.0 23.0 True
+17089 203 -23143.0 3.0 True
+17163 204 -20390.0 200.0 False
+17237 205 -18570.0 200.0 False
+17308 206 -14190.0 300.0 False
+17382 207 -10640.0 300.0 False
+17456 208 -5910.0 300.0 False
+17528 209 -2230.0 400.0 False
+17603 210 2680.0 400.0 False
```

## Contributing

If you have ideas for additional functionality or find bugs please create an [issue](https://github.com/php1ic/nuclearmasses/issues) or better yet a [pull request](https://github.com/php1ic/nuclearmasses/pulls).

-We use a combination of [isort](https://pycqa.github.io/isort/), [ruff](https://docs.astral.sh/ruff/) and [mypy](https://www.mypy-lang.org/) to keep things tidy and hopefully catch errors and bugs before they happen.
+We use a combination of [ruff](https://docs.astral.sh/ruff/) and [mypy](https://www.mypy-lang.org/) to keep things tidy and hopefully catch errors and bugs before they happen.
The command below returns no errors or issues so should be run after any code changes.
We might add a CI pipeline in the future, but for the moment, it's a manual process.
```bash
-isort . && ruff format && ruff check && mypy src
+ruff format && ruff check && mypy src
```

## Known issues
7 changes: 3 additions & 4 deletions pyproject.toml
@@ -75,17 +75,16 @@ omit = [
"tests/*"
]

[tool.isort]
known_first_party = ["nuclearmasses"]

[tool.ruff]
line-length = 120

[tool.ruff.format]
# Nothing different from defaults

[tool.ruff.lint.isort]
known-first-party = ["nuclearmasses"]
known-first-party = ["src"]
from-first = false
order-by-type = true
force-sort-within-sections = true

[tool.ruff.lint]
49 changes: 49 additions & 0 deletions src/nuclearmasses/io/ame.py
@@ -0,0 +1,49 @@
+from importlib.resources.abc import Traversable
+
+import pandas as pd
+
+from nuclearmasses.io.ame_mass_parse import AMEMassParser
+from nuclearmasses.io.ame_reaction_1_parse import AMEReactionParserOne
+from nuclearmasses.io.ame_reaction_2_parse import AMEReactionParserTwo
+
+
+class AME:
+    """Top level storage and functionality for AME data"""
+
+    def __init__(self, data_path: Traversable):
+        self.data_path = data_path
+        self.years: list[int] = [1983, 1993, 1995, 2003, 2012, 2016, 2020]
+        self.ame_files: list[tuple[str, str, str]] = [
+            ("mass.mas83", "rct1.mas83", "rct2.mas83"),
+            ("mass_exp.mas93", "rct1_exp.mas93", "rct2_exp.mas93"),
+            ("mass_exp.mas95", "rct1_exp.mas95", "rct2_exp.mas95"),
+            ("mass.mas03", "rct1.mas03", "rct2.mas03"),
+            ("mass.mas12", "rct1.mas12", "rct2.mas12"),
+            ("mass16.txt", "rct1-16.txt", "rct2-16.txt"),
+            ("mass.mas20", "rct1.mas20", "rct2.mas20"),
+        ]
+        self.files: dict[int, tuple[str, str, str]] = dict(zip(self.years, self.ame_files, strict=True))
+        self.ame_df: pd.DataFrame = self.parse_all_years()
+
+    def get_datafiles(self, year: int) -> tuple[Traversable, Traversable, Traversable]:
+        """Use the given year to locate the 3 AME data file and return the absolute paths."""
+        root = self.data_path / str(year)
+        mass, rct1, rct2 = self.files[year]
+
+        return root / mass, root / rct1, root / rct2
+
+    def parse_year(self, year: int) -> pd.DataFrame:
+        """Combine all the AME files from the given ``year``"""
+        ame_mass, ame_reaction_1, ame_reaction_2 = self.get_datafiles(year)
+
+        mass_df = AMEMassParser(filename=ame_mass, year=year).read_file()
+        rct1_df = AMEReactionParserOne(filename=ame_reaction_1, year=year).read_file()
+        rct2_df = AMEReactionParserTwo(filename=ame_reaction_2, year=year).read_file()
+
+        # Merge all 3 of the AME dataframes into one
+        common_columns = ["A", "Z", "N", "TableYear", "Symbol"]
+        return mass_df.merge(rct1_df, on=common_columns, how="outer").merge(rct2_df, on=common_columns, how="outer")
+
+    def parse_all_years(self) -> pd.DataFrame:
+        """Parse the files for all available years"""
+        return pd.concat((self.parse_year(y) for y in self.years), ignore_index=True)
9 changes: 3 additions & 6 deletions src/nuclearmasses/io/ame_mass_file.py
@@ -1,11 +1,8 @@
-from nuclearmasses.utils.converter import Converter
-
-
-class AMEMassFile(Converter):
+class AMEMassFile:
     """Easy access to the variables in the AME mass file."""

-    def __init__(self, year: int):
-        super().__init__()
+    def __init__(self, year: int, **kwargs):
+        super().__init__(**kwargs)
         match year:
             case 1983:
                 self.HEADER = 35
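The `**kwargs` pass-through above is what lets the parser classes inherit from the file-layout class and `Converter` side by side rather than in a chain. A minimal sketch of that cooperative pattern, assuming simplified stand-in classes: `FileLayout`, `Parser`, the reduced `Converter`, its symbol table and the placeholder header value for other years are all hypothetical, and only the `1983 -> 35` header value comes from the diff:

```python
class Converter:
    """Simplified stand-in for the shared helper class."""

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.symbols = {1: "H", 6: "C", 79: "Au"}

    def get_symbol(self, z: int) -> str:
        return self.symbols[z]


class FileLayout:
    """Stand-in for a layout class such as AMEMassFile."""

    def __init__(self, year: int, **kwargs):
        # Forward anything we do not consume to the next class in the MRO
        super().__init__(**kwargs)
        # 35 is the 1983 header size from the diff; 0 is only a placeholder
        self.HEADER = 35 if year == 1983 else 0


class Parser(FileLayout, Converter):
    """Stand-in for a parser that needs both the layout and the helpers."""

    def __init__(self, filename: str, year: int):
        super().__init__(year=year)
        self.filename = filename
        self.year = year


parser = Parser("mass.mas83", 1983)
mro_names = [cls.__name__ for cls in Parser.__mro__]
```

Because every `__init__` calls `super().__init__(**kwargs)`, one call from `Parser` initialises both bases in MRO order, and the layout class no longer needs to know about `Converter`.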
109 changes: 58 additions & 51 deletions src/nuclearmasses/io/ame_mass_parse.py
@@ -1,78 +1,84 @@
 import logging
-import pathlib

 import pandas as pd

 from nuclearmasses.io.ame_mass_file import AMEMassFile
+from nuclearmasses.utils.converter import Converter, DataInput


-class AMEMassParser(AMEMassFile):
+class AMEMassParser(AMEMassFile, Converter):
     """Parse the AME mass file.

     The format is known but the provided string does not match all lines.
     We will therefore use START and END markers, which are inherited, and
     read the columns are interested in.
     """

-    def __init__(self, filename: pathlib.Path, year: int):
+    def __init__(self, filename: DataInput, year: int):
         """Set the file to read and table year"""
-        self.filename: pathlib.Path = filename
+        super().__init__(year=year)
+        self.filename: DataInput = filename
         self.year: int = year
-        super().__init__(self.year)
         logging.info(f"Reading {self.filename} from {self.year}")

     def _column_names(self) -> list[str]:
         """Set the column name depending on the year"""
-        match self.year:
-            case _:
-                return [
-                    "Z",
-                    "A",
-                    "AMEMassExcess",
-                    "AMEMassExcessError",
-                    "BindingEnergyPerA",
-                    "BindingEnergyPerAError",
-                    "BetaDecayEnergy",
-                    "BetaDecayEnergyError",
-                    "AtomicNumber",
-                    "AtomicMass",
-                    "AtomicMassError",
-                ]
+        return [
+            "Z",
+            "A",
+            "AMEMassExcess",
+            "AMEMassExcessError",
+            "BindingEnergyPerA",
+            "BindingEnergyPerAError",
+            "BetaDecayEnergy",
+            "BetaDecayEnergyError",
+            "AtomicNumber",
+            "AtomicMass",
+            "AtomicMassError",
+        ]

     def _data_types(self) -> dict:
         """Set the data type depending on the year"""
-        match self.year:
-            case _:
-                return {
-                    "TableYear": "Int64",
-                    "Symbol": "string",
-                    "N": "Int64",
-                    "Z": "Int64",
-                    "A": "Int64",
-                    "AMEMassExcess": "float64",
-                    "AMEMassExcessError": "float64",
-                    "BindingEnergyPerA": "float64",
-                    "BindingEnergyPerAError": "float64",
-                    "BetaDecayEnergy": "float64",
-                    "BetaDecayEnergyError": "float64",
-                    "AtomicMass": "float64",
-                    "AtomicMassError": "float64",
-                }
+        return {
+            "TableYear": "Int64",
+            "Symbol": "string",
+            "N": "Int64",
+            "Z": "Int64",
+            "A": "Int64",
+            "AMEMassExcess": "float64",
+            "AMEMassExcessError": "float64",
+            "BindingEnergyPerA": "float64",
+            "BindingEnergyPerAError": "float64",
+            "BetaDecayEnergy": "float64",
+            "BetaDecayEnergyError": "float64",
+            "AtomicMass": "float64",
+            "AtomicMassError": "float64",
+        }

     def _na_values(self) -> dict:
         """Set the columns that have placeholder values"""
-        match self.year:
-            case 1983:
-                return {
-                    "A": [""],
-                    "BetaDecayEnergy": ["", "*"],
-                    "BetaDecayEnergyError": ["", "*"],
-                }
-            case _:
-                return {
-                    "BetaDecayEnergy": ["", "*"],
-                    "BetaDecayEnergyError": ["", "*"],
-                }
+        na_vals = {
+            "A": [""],
+            "BetaDecayEnergy": ["", "*"],
+            "BetaDecayEnergyError": ["", "*"],
+        }
+
+        if self.year != 1983:
+            na_vals.pop("A")
+
+        return na_vals
+
+    def calculate_relative_error(self, raw_df) -> pd.DataFrame:
+        """Calculate the relative error of the mass excess
+
+        12C has a 0.0 +/- 0.0 mass excess definition by definition so ensure that is still true.
+        """
+        raw_df["AMERelativeError"] = abs(
+            raw_df["AMEMassExcessError"].astype(float) / raw_df["AMEMassExcess"].astype(float)
+        )
+        raw_df.loc[(raw_df.Z == 6) & (raw_df.A == 12), "AMERelativeError"] = 0.0
+
+        return raw_df

     def read_file(self) -> pd.DataFrame:
         """Read the file using it's known format
@@ -81,7 +87,7 @@ def read_file(self) -> pd.DataFrame:
         column names, data types and locations of the date so we can now make the generic
         call to parse the file.
         """
-        df = pd.read_fwf(
+        df = Converter.read_fwf(
             self.filename,
             colspecs=self.column_limits,
             names=self._column_names(),
@@ -119,9 +125,10 @@ def read_file(self) -> pd.DataFrame:

         # We need to rescale the error value because we combined the two columns above
         df = df.assign(AtomicMassError=df["AtomicMassError"].astype(float) / 1.0e6)
+        df = self.calculate_relative_error(df)

         df["TableYear"] = self.year
         df["N"] = pd.to_numeric(df["A"]) - pd.to_numeric(df["Z"])
-        df["Symbol"] = pd.to_numeric(df["Z"]).map(self.z_to_symbol)
+        df["Symbol"] = pd.to_numeric(df["Z"]).map(self.get_symbol)

         return df.astype(self._data_types())
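The new `calculate_relative_error` step divides the mass excess error by the mass excess and then pins 12C to zero, since its mass excess is 0.0 +/- 0.0 by definition. A plain-Python sketch of the same rule for a single nuclide, assuming a hypothetical helper `relative_error` that is not part of the PR; the pandas version in the diff computes the whole column first and overwrites the 12C row afterwards, whereas this sketch checks for 12C before dividing:

```python
def relative_error(mass_excess: float, error: float, z: int, a: int) -> float:
    """Return |error / mass_excess|, with 12C (Z=6, A=12) fixed at 0.0."""
    if z == 6 and a == 12:
        # 12C defines the mass scale, so skip the 0.0 / 0.0 division entirely
        return 0.0
    return abs(error / mass_excess)


# 190Re (Z=75) using the 2020 AME values quoted in the README hunk
re190 = relative_error(-35583.015, 4.870, z=75, a=190)

# 12C is exact by definition
c12 = relative_error(0.0, 0.0, z=6, a=12)
```

The `abs` call matters because most mass excesses in this region are negative while the quoted errors are positive.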
9 changes: 3 additions & 6 deletions src/nuclearmasses/io/ame_reaction_1_file.py
@@ -1,12 +1,9 @@
-from nuclearmasses.utils.converter import Converter
-
-
-class AMEReactionFileOne(Converter):
+class AMEReactionFileOne:
     """Easy access to the variables in the first AME reaction file."""

-    def __init__(self, year: int):
+    def __init__(self, year: int, **kwargs):
         """Setup the values that locate the variable."""
-        super().__init__()
+        super().__init__(**kwargs)
         match year:
             case 1983:
                 self.HEADER = 30