39 commits
785c7ec
added preprocessing and trajectory_projector
awestphal1 Nov 13, 2025
61be7f9
started editing user guide, added test file
awestphal1 Nov 26, 2025
12111b4
Fixed Float_Precision Problem and added tests
awestphal1 Nov 27, 2025
c5d0f8c
Current update
awestphal1 Dec 2, 2025
9ac8ec5
new distance function
awestphal1 Dec 4, 2025
1d47c0b
errors for false parameters, better structure, method for catching in…
Dec 4, 2025
15d9c87
Added Errors for Exceptions
Dec 4, 2025
62caaa6
Finish documentation
Dec 7, 2025
16c537b
Merge branch 'preproccesing' into main
Dec 7, 2025
a3d82d6
changed list structure for geo_data into dataframes
awestphal1 Feb 11, 2026
9414d93
Fixed Docstring Notation
awestphal1 Feb 12, 2026
8030b73
Updated min_ - / max_distance to user_guide
awestphal1 Feb 12, 2026
2e33e50
restored docs/source/ notebooks from main
awestphal1 Feb 12, 2026
bde30d7
Merge branch 'preproccesing'
awestphal1 Feb 12, 2026
46ac491
Merge branch 'PedestrianDynamics:main' into main
awestphal1 Feb 12, 2026
219a5d3
Fixed destroyed symlinks
awestphal1 Feb 24, 2026
3bd2c3b
Symlinks describtion for Windows User in developeterguide
awestphal1 Mar 3, 2026
f19b5be
Fixed import statements
awestphal1 Mar 3, 2026
2ac34d1
added a visualization for the min-/max-distance parameters
awestphal1 Mar 3, 2026
9aabe50
added a visualization for the min-/max-distance parameters
awestphal1 Mar 3, 2026
afa0168
Merge branch 'PedestrianDynamics:main' into main
awestphal1 Mar 3, 2026
07a1534
deleted worse .svg image
awestphal1 Mar 3, 2026
cb39276
Merge branch 'documentation_preprocessing'
awestphal1 Mar 3, 2026
78a8adf
Merge branch 'main' into main
awestphal1 Mar 17, 2026
4d9ef1d
Improved the plot for showing preprocessing parameters and corrected …
awestphal1 Mar 17, 2026
29ca64a
Merge branch 'main' of https://github.com/awestphal1/PedPy
awestphal1 Mar 17, 2026
d26ce82
adapted init so tests hopefully run through
awestphal1 Mar 17, 2026
b38b4d9
Merge branch 'PedestrianDynamics:main' into main
awestphal1 Apr 2, 2026
4e94f79
added file trajectory_oulier_detetction + code
awestphal1 Apr 5, 2026
e2222b2
implemented tests for outlier detection
awestphal1 Apr 7, 2026
9b5eb0c
fixed personID counter for drops_only
awestphal1 Apr 7, 2026
e507e2f
added outlier_detection documenatation to preprocessing notebook
awestphal1 Apr 11, 2026
abf8462
finished outlier_detection and correcting_invalid_traj in preprocessi…
awestphal1 Apr 13, 2026
7ed1a73
Merge branch 'PedestrianDynamics:main' into main
awestphal1 Apr 17, 2026
805cda4
finished documentation
awestphal1 Apr 17, 2026
5f9f0f3
Merge branch 'detect_outliers'
awestphal1 Apr 17, 2026
90297cc
pre-commit issue solved
awestphal1 Apr 17, 2026
90a465d
improved documentation
awestphal1 Apr 20, 2026
f64e8f2
Edited the preprocessing notebook
awestphal1 May 5, 2026
2,121 changes: 2,121 additions & 0 deletions docs/source/images/invalid_trajectory_person_7_corrected.svg
1,163 changes: 1,163 additions & 0 deletions docs/source/images/invalid_trajectory_person_7_original.svg
61,876 changes: 61,876 additions & 0 deletions notebooks/demo-data/preprocessing/030_c_56_h0_invalid.txt

Large diffs are not rendered by default.

224,184 changes: 224,184 additions & 0 deletions notebooks/demo-data/preprocessing/uni_corr_500_08_modified.txt

Large diffs are not rendered by default.

316 changes: 316 additions & 0 deletions notebooks/preprocessing.ipynb
@@ -0,0 +1,316 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"id": "0",
"metadata": {},
"outputs": [],
"source": [
"import pathlib\n",
"\n",
"import matplotlib.pyplot as plt\n",
"import shapely\n",
"from matplotlib.lines import Line2D\n",
"\n",
"import pedpy\n",
"from pedpy.plotting.plotting import PEDPY_ORANGE, PEDPY_PETROL"
]
},
{
"cell_type": "markdown",
"id": "1",
"metadata": {},
"source": [
"# Preprocessing\n",
"\n",
"Pedpy provides functions for preprocessing:\n",
"\n",
"1. Outlier detection\n",
"2. Correcting invalid trajectories\n"
]
},
{
"cell_type": "markdown",
"id": "2",
"metadata": {},
"source": [
"## Outlier detection\n",
"\n",
"*PedPy* provides a function that detects and corrects outliers and also detects vertical displacements within the trajectory, which occur when the tracking of a person is interrupted and the tracker continues tracking something else instead.\n",
"\n",
"The algorithm for detecting outliers splits the trajectory into multiple dataframes, one per person, and calculates the distance between each pair of consecutive points. The expected distance d is defined as the 99% quantile of the distances between all consecutive points, multiplied by the tolerance t.\n",
"\n",
"$$\n",
"d = t * q_{0.99}\n",
"$$\n",
"\n",
"##### tolerance:\n",
"The tolerance parameter can be chosen manually. A low value for this parameter means a low tolerance for potential outliers, which can be useful in trajectories where pedestrians’ speed stays within a similar range. If pedestrian speed varies, for example in bottleneck experiments, the tolerance should be chosen higher. A value between 2 and 10 should cover most cases.\n",
"\n",
"\n",
"If an outlier is detected, the program checks whether there are consecutive outliers. Since the distance can no longer be used as an indicator, the function searches for the next frame within a realistic range r. Every subsequent frame that is not within this range is also considered an outlier, and the factor n is increased by one. In this case, as points should not be considered valid again by accident, the tolerance t' is much smaller.\n",
"\n",
"$$\n",
"r = n * t' * q_{0.99}\n",
"$$\n",
"\n",
"##### quantile:\n",
"\n",
"Like the tolerance, the quantile for the expected distance can also be chosen manually. This also influences the tolerance.\n",
"\n",
"For every part of the trajectory, where anomalies were detected, the corresponding person id and frames, where outlier occurred, are put into the log output.\n",
"\n",
"Outliers in the middle of the trajectory are corrected by interpolating the incorrect points as a straight line between the two correct points before and after the outlier occurs. Outliers at the beginning or at the end are extrapolated in the average direction of the trajectory."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3",
"metadata": {},
"outputs": [],
"source": [
"trajectory_data = pedpy.load_trajectory(\n",
" trajectory_file=pathlib.Path(\"demo-data/preprocessing/uni_corr_500_08_modified.txt\"),\n",
" default_unit=pedpy.TrajectoryUnit.METER,\n",
")\n",
"trajectory_data_corrected, changed_index_orig, changed_index_new = pedpy.detect_anomalies_in_trajectories(\n",
" trajectory_data, tolerance=6, quantile=0.98\n",
")"
]
},
{
"cell_type": "markdown",
"id": "4",
"metadata": {},
"source": [
"\n",
"### Invalid trajectories\n",
"\n",
"If in a trajectory data set of a single person id more that certain percentage of all frames were considered outliers, this part of the trajectory is considered invalid.\n",
"\n",
"##### percentage_invalid:\n",
"This percentage mentioned above can be chosen manually by the percentage_invalid parameter, an integer parameter between 1 and 100. The default value is 20%.\n",
"\n",
"##### deleting:\n",
"\n",
"The function provides the bool parameter deleting, where the user can determine, that invalid data sets should be removed in the returned trajectory.\n",
"\n",
"### Focus on displacement detection\n",
"\n",
"##### displacements_only:\n",
"\n",
"It is possible to filter only for displacements in the trajectory by setting displacements_only = True. In this case, anomalies that do not occur at the very beginning or the very end are ignored, and the trajectory data of the affected person ID is only cropped after a displacement. If outliers occur at the very beginning, they are removed as well.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5",
"metadata": {},
"outputs": [],
"source": [
"trajectory_data_jumps_only, index_orig, index_new = pedpy.detect_anomalies_in_trajectories(\n",
" trajectory_data, displacements_only=True\n",
")"
]
},
{
"cell_type": "markdown",
"id": "6",
"metadata": {},
"source": [
"### Other parameters\n",
"\n",
"In the following description the term trajectory means the trajectory data of a single person id.\n",
"\n",
"##### max_length:\n",
"An integer value. Sometimes it may happen that a few outliers occur directly one after another without a jump back to the correct trajectory. The max_length parameter defines how many frames long these consecutive outliers can be before the program checks whether this indicates a vertical displacement in the trajectory. The default value is 8.\n",
"\n",
"##### critical_length_traj:\n",
"The minimum length a trajectory can have. This integer value is only relevant in cases where it seems that there is a displacement in the trajectory. If the supposed displacement happens before the number of previous frames can be considered a trajectory in its own right, every frame before the detected anomaly is assumed to be an outlier. If the minimum length has already been reached, the trajectory is cropped at the displacement. The default value is 10% of the trajectory’s length."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7",
"metadata": {},
"outputs": [],
"source": [
"traj_data_low_tolerance = pedpy.detect_anomalies_in_trajectories(\n",
Collaborator

From the description above, it is not clear what the parameters do that you use in this example. Can you include parameters that you use in the description?

Collaborator Author

Done.

" trajectory_data, tolerance=3, quantile=0.95, percentage_invalid=20, deleting=True, max_length=10\n",
")[0]"
]
},
{
"cell_type": "markdown",
"id": "8",
"metadata": {},
"source": [
"### Compare original and corrected trajectory\n",
"\n",
"The function returns a corrected copy of the input trajectory data. Furthermore, it returns two lists: the first contains all person ids of the parts of the original trajectory where anomalies were found, and the second contains the corresponding person IDs of the corrected trajectory. In most cases, these are the same; only if some person IDs were deleted do subsequent person IDs shift.\n",
"\n",
"These lists can be used to plot the trajectory segments to get an impression of the outliers and how they were corrected. The black line represents the original trajectory, and the blue line represents the corrected one."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9",
"metadata": {},
"outputs": [],
"source": [
"walk_area = pedpy.WalkableArea(\n",
" shapely.from_wkt(\n",
" \"POLYGON ((10 -2, -10 -2, -10 7, 10 7, 10 -2), (9 6, -9 6, -9 5, 9 5, 9 6), (-9 -1, 9 -1, 9 0, -9 0, -9 -1))\"\n",
" )\n",
")\n",
"\n",
"%config InlineBackend.figure_format = 'retina'\n",
"\n",
"\n",
"for i in range(len(changed_index_orig)):\n",
" original_trajectory = trajectory_data.data[trajectory_data.data[\"id\"] == changed_index_orig[i]]\n",
" trajectory_corrected = trajectory_data_corrected.data[trajectory_data_corrected.data[\"id\"] == changed_index_new[i]]\n",
" pedpy.plot_trajectories(\n",
Collaborator

Some comments about the plots being made:

  • Can you choose PedPy colors?
  • The colors you chose have not enough contrast. At least on my screen, I cannot really distinguish between black and blue trajectories (see attached example). Please adjust the colors.
  • Please include a legend (which color represents what?)
  • Idea: Include titles to the plots stating which type of error was detected.
Image

Collaborator Author

Done. I tried to find a combination of two PedPy colors, that works with light- and with darkmode. I did not add a title, because depending on the parameters a trajectory could have outliers and a displacement later, so it is difficult to define a clear type.

" traj=pedpy.TrajectoryData(data=original_trajectory, frame_rate=trajectory_data.frame_rate),\n",
" walkable_area=walk_area,\n",
" traj_width=1.75,\n",
" traj_color=PEDPY_PETROL,\n",
" ).set_aspect(\"equal\")\n",
" pedpy.plot_trajectories(\n",
" traj=pedpy.TrajectoryData(data=trajectory_corrected, frame_rate=trajectory_data.frame_rate),\n",
" walkable_area=walk_area,\n",
" traj_width=0.5,\n",
" traj_color=PEDPY_ORANGE,\n",
" ).set_aspect(\"equal\")\n",
" legend_elements = [\n",
" Line2D([0], [0], color=PEDPY_PETROL, lw=2, label=\"Original\"),\n",
" Line2D([0], [0], color=PEDPY_ORANGE, lw=2, label=\"Corrected\"),\n",
" ]\n",
"\n",
" plt.legend(handles=legend_elements, bbox_to_anchor=(1, 1), fontsize=8)\n",
" plt.xlabel(f\"personID {changed_index_orig[i]} / {changed_index_new[i]}\")\n",
" plt.show()"
]
},
{
"cell_type": "markdown",
"id": "10",
"metadata": {},
"source": [
"## Correct invalid trajectories\n",
"\n",
"When working with head trajectories, participants may occasionally lean over obstacles. As a result, their trajectories can leave the walkable area for some frames, and this data cannot be processed by *PedPy*.\n",
"\n",
"To address this, there is a function that moves trajectory points that lay inside a wall or too close to it. The distance that should remain between the point and the wall afterwards is calculated by linear interpolation. The new distance lies within the interval between min_distance and max_distance:\n",
"\n",
"$$\n",
"d' = (d-b)*{(e-s) \\over (e-b)}+s\n",
"$$\n",
"\n",
"- d' is the new distance to the wall\n",
"- d is the original distance to the wall\n",
"- b corresponds to back_distance\n",
"- s corresponds to min_distance\n",
"- e corresponds to max_distance\n",
"\n",
"```{eval-rst}\n",
".. figure:: images/parameters_preprocessing.png\n",
" :width: 400px\n",
" :align: center\n",
"```\n",
"\n",
"If a point lies inside the geometry or too close to it, it will be pushed outward. The distance interval for these points starts at back_distance, which must be negative because it represents the maximum depth inside the wall, and ends at max_distance. Points located deeper inside an obstacle are assigned a smaller new distance than points located near the boundary of the interval.\n",
"\n",
"For example, a point, which lays deep inside an obstacle will receive a new distance close to min_distance, which represents the minimum possible value for new_distance. A point that is already outside the obstacle but needs to be adjusted for smoother results will also receive a new distance, but this value will be only slightly larger than its original distance.\n",
"\n",
"It is essential that max_distance is larger than min_distance, and that back_distance is negative. Depending on the geometry and the parameter values, it can also be beneficial to buffer the geometry beforehand to create thicker walls. If the walls are too thin, the function may accidentally move a point to the wrong side.\n",
"\n",
"The function returns a pedpy.TrajectoryData, either the corrected version of the trajectory or the\n",
" original trajectory, if the original trajectory was valid."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "11",
"metadata": {},
"outputs": [],
"source": [
"trajectory_data = pedpy.load_trajectory(\n",
" trajectory_file=pathlib.Path(\"demo-data/preprocessing/030_c_56_h0_invalid.txt\"),\n",
" default_unit=pedpy.TrajectoryUnit.METER,\n",
")\n",
"\n",
"walk_area = pedpy.WalkableArea(\n",
" [\n",
" (3.5, -2),\n",
" (3.5, 8),\n",
" (-3.5, 8),\n",
" (-3.5, -2),\n",
" ],\n",
" obstacles=[\n",
" [\n",
" (-0.7, -1.1),\n",
" (-0.25, -1.1),\n",
" (-0.25, -0.15),\n",
" (-0.4, 0.0),\n",
" (-2.8, 0.0),\n",
" (-2.8, 6.7),\n",
" (-3.05, 6.7),\n",
" (-3.05, -0.3),\n",
" (-0.7, -0.3),\n",
" (-0.7, -1.0),\n",
" ],\n",
" [\n",
" (0.25, -1.1),\n",
" (0.7, -1.1),\n",
" (0.7, -0.3),\n",
" (3.05, -0.3),\n",
" (3.05, 6.7),\n",
" (2.8, 6.7),\n",
" (2.8, 0.0),\n",
" (0.4, 0.0),\n",
" (0.25, -0.15),\n",
" (0.25, -1.1),\n",
" ],\n",
" ],\n",
")\n",
"\n",
"print(\"Valid before: \", pedpy.is_trajectory_valid(traj_data=trajectory_data, walkable_area=walk_area))\n",
"\n",
"valid_trajectory = pedpy.correct_invalid_trajectories(\n",
" trajectory_data=trajectory_data,\n",
" walkable_area=walk_area,\n",
" min_distance_obst=0.01,\n",
" max_distance_obst=0.05,\n",
" back_distance_obst=-0.5,\n",
" min_distance_wall=0.01,\n",
" max_distance_wall=0.05,\n",
" back_distance_wall=-0.5,\n",
")\n",
"print(\"Valid after: \", pedpy.is_trajectory_valid(traj_data=valid_trajectory, walkable_area=walk_area))"
Collaborator

It would be nice to include two plots, the not corrected and the corrected trajectories.

Collaborator Author

Done. I added two exemplary plots to give an idea of how the function modifies the trajectory. I will add a plotting function to the notebook, similar to the one used for outlier detection, when I include a list of modified person IDs in the return values.

]
},
{
"cell_type": "markdown",
"id": "12",
"metadata": {},
"source": [
"The values for min_-/max_- and back_distance are chosen differentially for walls around the geometry and for obstacles within it.\n",
"\n",
"An example, how the function corrects invalid trajectories: The first plot shows the original invalid trajectory, the second plot the corrected one.\n",
"\n",
"![original](../docs/source/images/invalid_trajectory_person_7_original.svg)\n",
"![corrected](../docs/source/images/invalid_trajectory_person_7_corrected.svg)\n"
]
}
],
"metadata": {},
"nbformat": 4,
"nbformat_minor": 5
}
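The two formulas described in `notebooks/preprocessing.ipynb` above (the expected distance `d = t * q` and the linear remapping of wall distances `d'`) can be sketched in plain NumPy. This is a minimal illustration under stated assumptions, not PedPy's implementation: the function names `expected_distance` and `remap_distance` are hypothetical, the quantile is taken over a single synthetic track, and the wall distance is assumed to be signed (negative inside a wall).

```python
import numpy as np


def expected_distance(points, tolerance, quantile):
    """Return (step lengths, d) with d = tolerance * quantile of the step lengths."""
    points = np.asarray(points, dtype=float)
    # Euclidean distance between each pair of consecutive points
    steps = np.linalg.norm(np.diff(points, axis=0), axis=1)
    return steps, tolerance * np.quantile(steps, quantile)


def remap_distance(d, back_distance, min_distance, max_distance):
    """Map d from [back_distance, max_distance] linearly onto [min_distance, max_distance]."""
    scale = (max_distance - min_distance) / (max_distance - back_distance)
    return (d - back_distance) * scale + min_distance


# Synthetic single-person track (assumption): steady 0.04 m steps along x,
# with one point shifted by 2 m in y to imitate an outlier.
track = np.column_stack([0.04 * np.arange(200), np.zeros(200)])
track[100, 1] = 2.0

steps, d = expected_distance(track, tolerance=6, quantile=0.98)
suspicious_steps = np.flatnonzero(steps > d)  # the steps into and out of the shifted point

# min/max/back values as in the notebook's correct_invalid_trajectories call;
# the original distance of -0.2 is a made-up example value.
new_distance = remap_distance(-0.2, back_distance=-0.5, min_distance=0.01, max_distance=0.05)
print(suspicious_steps, round(new_distance, 4))
```

The sketch only flags steps that exceed `d`; the handling of consecutive outliers, displacements, and interpolation described in the notebook is deliberately left out.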
2 changes: 2 additions & 0 deletions pedpy/__init__.py
@@ -152,6 +152,7 @@
plot_voronoi_cells,
plot_walkable_area,
)
from .preprocessing.trajectory_outlier_detection import detect_anomalies_in_trajectories
from .preprocessing.trajectory_projector import correct_invalid_trajectories

__all__ = [ # noqa: RUF022 disable sorting of __all__ for better maintenance
@@ -217,6 +218,7 @@
"compute_mean_acceleration_per_frame",
"compute_voronoi_acceleration",
"correct_invalid_trajectories",
"detect_anomalies_in_trajectories",
"PEDPY_BLUE",
"PEDPY_GREEN",
"PEDPY_GREY",