  • Warsaw

truevektor/README.md

Michał Trojaczek

Criminal Procedure • Audio Forensics • Digital Evidence • Data Analysis
I use GitHub as a learning laboratory and a platform for transparent, reproducible technical-legal research.


👋 About Me

I work at the intersection of:

  • criminal procedure and evidence law,
  • audio forensics (diarization, transcription, signal analysis),
  • digital evidence engineering and chain-of-custody reconstruction,
  • technical building assessments (heating systems, thermal audits, documentation).

My repositories document how law, technology, measurement, and methodical reasoning can be integrated into reproducible workflows.

GitHub allows me to:

  • experiment with tools such as Python, Whisper, pyannote, ffmpeg,
  • build full evidence repositories for real cases (e.g., II K 70/24),
  • teach others how technical and legal inquiry complement one another.

🎓 GitHub Education – How I Use It

Within GitHub Education and the Student Developer Pack, I focus on:

  • building open case studies for students of law, forensics, and data science,
  • creating step-by-step pipelines for evidence analysis:
    • audio preprocessing,
    • diarization and transcription,
    • timing consistency checks,
    • comparison between courtroom recordings and official transcripts,
  • documenting research-ready, reproducible workflows.
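A minimal sketch of the timing consistency check listed above, assuming diarization output has already been reduced to plain start/end times in seconds. The `Segment` type, the `check_timing` name, and the 2-second gap threshold are illustrative assumptions, not names taken from the repositories.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    start: float  # seconds from the beginning of the recording
    end: float    # seconds from the beginning of the recording
    label: str    # hypothetical field, e.g. a speaker label


def check_timing(segments, total_duration, max_gap=2.0):
    """Return a list of human-readable timing issues (empty if none found)."""
    issues = []
    ordered = sorted(segments, key=lambda s: s.start)

    # Per-segment checks: positive duration and inside the recording.
    for seg in ordered:
        if seg.end <= seg.start:
            issues.append(f"non-positive duration: {seg}")
        if seg.start < 0 or seg.end > total_duration:
            issues.append(f"outside recording bounds: {seg}")

    # Pairwise checks: unexplained silence between consecutive segments.
    # Overlaps are not flagged, because overlapping speech is legitimate.
    for prev, cur in zip(ordered, ordered[1:]):
        gap = cur.start - prev.end
        if gap > max_gap:
            issues.append(
                f"gap of {gap:.2f}s between {prev.end:.2f}s and {cur.start:.2f}s"
            )
    return issues
```

An empty result only means that none of these particular checks fired; it does not establish that the recording is complete.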

My repositories are designed so that anyone can:

  1. clone the project,
  2. follow the README instructions,
  3. reproduce the analysis with full transparency.

🔍 Current Learning & Research Areas

  • advanced diarization and speaker-tracking (pyannote),
  • forensic-grade signal analysis (formants, silence segmentation, artifact detection),
  • automated procedural reporting (Python → DOCX/PDF pipelines),
  • GitHub Actions for:
    • auto-updating statistics,
    • generating forensic summaries,
    • validating data integrity in multi-file repositories.
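One way to validate data integrity in a multi-file repository is to compare each file against a stored SHA-256 digest. The sketch below assumes a manifest that maps relative paths to hex digests; the manifest shape and the `verify_manifest` name are hypothetical, and only the Python standard library is used.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large recordings do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(root, manifest):
    """Check files under `root` against `manifest` (relative path -> hex digest).

    Returns a dict mapping each path to "ok", "mismatch", or "missing".
    """
    root = Path(root)
    report = {}
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file():
            report[rel_path] = "missing"
        elif sha256_of(target) != expected.lower():
            report[rel_path] = "mismatch"
        else:
            report[rel_path] = "ok"
    return report
```

In a CI job, a script built on this could exit with a non-zero status whenever any entry is not "ok", so that the workflow run fails visibly.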

🧪 Featured Educational Projects

IIK_70_24 – Digital Evidence Repository

A structured, multi-layered reproduction of a real criminal case, including:

  • evidence indexing (“digital case file” format),
  • audio recordings (M4A/OGG/WAV) + diarization,
  • transcript comparison tools,
  • chain-of-custody reconstruction,
  • educational notebooks explaining the methodology.

Intended use: legal tech courses, forensic audio workshops, digital evidence methodology training.


fonoskopia-tools

A modular pipeline for:

  • audio conversion and resampling (ffmpeg),
  • diarization (pyannote),
  • Whisper-based transcription,
  • formatting transcripts for legal review.

Includes well-commented workflows for both beginners and advanced students.
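The first and last stages of such a pipeline can be sketched as follows. This is an illustration under stated assumptions, not the code in fonoskopia-tools: the function names are hypothetical, and 16 kHz mono 16-bit PCM is assumed as the target format because it is a common input for speech models.

```python
def ffmpeg_resample_cmd(src, dst, sample_rate=16000):
    """Build an ffmpeg argument list: mono, fixed sample rate, 16-bit PCM WAV.

    "-n" makes ffmpeg refuse to overwrite an existing output file.
    """
    return [
        "ffmpeg", "-nostdin", "-n",
        "-i", str(src),
        "-ac", "1",                # one audio channel
        "-ar", str(sample_rate),   # output sample rate in Hz
        "-c:a", "pcm_s16le",       # uncompressed 16-bit little-endian PCM
        str(dst),
    ]


def format_timestamp(seconds):
    """Render a time offset in seconds as HH:MM:SS.mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def format_line(start, end, speaker, text):
    """Format one transcript entry for review: time range, speaker, text."""
    return (
        f"[{format_timestamp(start)} - {format_timestamp(end)}] "
        f"{speaker}: {text.strip()}"
    )
```

The command list can be passed to `subprocess.run(cmd, check=True)`; building it as a list rather than a shell string avoids quoting problems with file names.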


joliot-curie-19a

A technical repository documenting:

  • heating & hot-water system diagnostics,
  • 3D measurement workflows (Leica DISTO X6),
  • thermal imaging interpretation (FLIR C5),
  • building audit methodology.

Intended use: building-science labs, civil engineering students, interdisciplinary research combining law & engineering.


🛠️ Tech Stack

Languages & Tools

  • Python (analysis, automation, reporting)
  • ffmpeg, sox (audio processing)
  • Whisper, pyannote-audio (speech & diarization)
  • Git, GitHub (Actions, LFS, structured evidence repositories)
  • Audacity / Adobe Audition (signal inspection)

Hardware

  • Dell Precision 7550 (CUDA)
  • Samsung Galaxy S24 Ultra (raw recordings, photogrammetry)
  • Leica DISTO X6, FLIR C5, Bosch GLL 3-80 C
  • HP Color LaserJet Pro M252dw (documentation output)

📈 Automated Stats

This profile uses GitHub Actions to automatically generate:

  • commit-activity graphs,
  • language-usage statistics,
  • repository analytics.

Example outputs (images): Commit activity, Languages.


🧭 Assignments / Labs (GitHub Education Compatible)

The following exercises are designed for students, educators, and researchers who want to explore digital forensics, legal-tech analysis, or data-driven methodology.

Lab 1 — Audio Evidence Integrity Check

Objective: Reproduce a basic forensic workflow.
Tasks:

  1. Clone fonoskopia-tools.
  2. Convert the sample audio files using ffmpeg.
  3. Run diarization and generate a segment map.
  4. Compare diarization timestamps with the transcript.
  5. Submit a short report explaining discrepancies.
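Task 4 can be approached roughly as below. The sketch assumes that diarization segments and transcript entries are available as plain tuples, and that anonymous diarization labels have already been mapped to the speaker names used in the transcript. The tuple layout, the function names, and the 0.5-second tolerance are all hypothetical.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def compare_with_transcript(diarization, transcript, tolerance=0.5):
    """Compare transcript entries with diarization segments.

    diarization: list of (start, end, speaker)
    transcript:  list of (start, end, speaker, text)
    Returns a list of {"text": ..., "issue": ...} dicts.
    """
    findings = []
    for t_start, t_end, t_speaker, text in transcript:
        # Pick the diarization segment that overlaps this entry the most.
        best = max(
            diarization,
            key=lambda d: overlap(t_start, t_end, d[0], d[1]),
            default=None,
        )
        if best is None or overlap(t_start, t_end, best[0], best[1]) == 0.0:
            findings.append({"text": text, "issue": "no matching diarization segment"})
            continue
        d_start, _d_end, d_speaker = best
        if d_speaker != t_speaker:
            findings.append(
                {"text": text, "issue": f"speaker differs: {t_speaker} vs {d_speaker}"}
            )
        if abs(d_start - t_start) > tolerance:
            findings.append(
                {"text": text, "issue": f"start offset {d_start - t_start:+.2f}s"}
            )
    return findings
```

Each finding is a starting point for the report in task 5; whether a discrepancy matters still requires listening to the passage in question.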

Lab 2 — Transcript vs Recording Consistency

Objective: Identify procedural anomalies.
Tasks:

  1. Load a provided courtroom audio segment.
  2. Generate a Whisper transcription.
  3. Compare it line-by-line with the “official transcript”.
  4. Highlight mismatches related to:
    • omission,
    • misattribution of speakers,
    • altered sequencing.
  5. Discuss the impact on fair-trial guarantees.
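A rough sketch of tasks 3 and 4 using the standard-library `difflib` module. It assumes both transcripts are lists of lines in a `SPEAKER: text` form, which is an assumption about the data rather than a documented format, and the function names are hypothetical.

```python
import difflib


def split_speaker(line):
    """Split 'SPEAKER: text' into (speaker, text); speaker is '' if absent."""
    speaker, sep, text = line.partition(":")
    return (speaker.strip(), text.strip()) if sep else ("", line.strip())


def compare_transcripts(recognized, official):
    """Classify differences between a recognized and an official transcript."""
    rec = [split_speaker(line) for line in recognized]
    off = [split_speaker(line) for line in official]
    # Align on the spoken text only, so speaker labels can be compared separately.
    matcher = difflib.SequenceMatcher(
        a=[text.lower() for _, text in rec],
        b=[text.lower() for _, text in off],
        autojunk=False,
    )
    findings = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            for (r_spk, text), (o_spk, _) in zip(rec[i1:i2], off[j1:j2]):
                if r_spk != o_spk:
                    findings.append(("misattribution", text, r_spk, o_spk))
        elif tag == "delete":
            findings.extend(("omitted from official", t) for _, t in rec[i1:i2])
        elif tag == "insert":
            findings.extend(("only in official", t) for _, t in off[j1:j2])
        else:  # "replace": wording differs on both sides
            findings.append(
                ("changed", [t for _, t in rec[i1:i2]], [t for _, t in off[j1:j2]])
            )
    return findings
```

Exact matching is strict: recognition errors in the automatic transcription will surface as "changed" entries, so these need manual review before any conclusion is drawn. Altered sequencing typically appears as a line reported once as omitted and once as only in the official transcript.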

Lab 3 — Chain of Custody Reconstruction

Objective: Understand the digital evidence lifecycle.
Tasks:

  1. Navigate the IIK_70_24 directory structure.
  2. Review metadata in the metadane/ folder.
  3. Build a timeline of evidence creation, copying, and storage.
  4. Identify any “breaks” in the chain.
  5. Submit a formal structured report.
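Tasks 3 and 4 can be sketched as below. The record layout is an assumption made for illustration: each event is a dict with `timestamp` (ISO 8601, consistently with or without a UTC offset), `action`, `holder`, and `sha256` keys. The actual contents of the metadane/ folder may be structured differently.

```python
from datetime import datetime


def build_timeline(events):
    """Return custody events sorted chronologically by ISO 8601 timestamp."""
    return sorted(events, key=lambda e: datetime.fromisoformat(e["timestamp"]))


def find_breaks(timeline):
    """Flag suspicious transitions between consecutive events.

    Checks each event after the first for:
      - a missing hash,
      - a hash that differs from the previous event's hash,
      - a change of holder that is not recorded as a "transfer" action.
    """
    breaks = []
    for prev, cur in zip(timeline, timeline[1:]):
        if not cur.get("sha256"):
            breaks.append((cur["timestamp"], "no hash recorded"))
        elif prev.get("sha256") and cur["sha256"] != prev["sha256"]:
            breaks.append((cur["timestamp"], "hash differs from previous event"))
        if cur["holder"] != prev["holder"] and cur["action"] != "transfer":
            breaks.append((cur["timestamp"], "change of holder without transfer record"))
    return breaks
```

A flagged transition is a question to investigate, not a finding in itself: a changed hash may reflect a documented format conversion rather than tampering.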

Lab 4 — Technical Building Audit (Interdisciplinary)

Objective: Apply an engineering measurement workflow.
Tasks:

  1. Review photographic and thermographic documentation in joliot-curie-19a.
  2. Analyze heat-loss indicators and insulation patterns.
  3. Correlate sensor readings with structural documentation.
  4. Produce a short engineering-legal assessment.

Lab 5 — Automated Reporting with GitHub Actions

Objective: Learn reproducible research using CI.
Tasks:

  1. Copy the provided workflow.
  2. Configure a daily stats updater.
  3. Add a Python script generating a PDF summary.
  4. Publish results to the repository’s README.
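For task 4, one common approach is to rewrite only a marked region of the README, so that repeated runs leave the rest of the file unchanged. The sketch below covers that step only. The marker strings and the `replace_section` name are hypothetical, and the provided workflow may do this differently.

```python
import re

# Hypothetical markers; they must appear once, in this order, in the README.
START = "<!-- stats:start -->"
END = "<!-- stats:end -->"


def replace_section(readme_text, new_body, start=START, end=END):
    """Replace everything between the start and end markers with new_body."""
    pattern = re.compile(re.escape(start) + r".*?" + re.escape(end), re.DOTALL)
    if not pattern.search(readme_text):
        raise ValueError("marker pair not found in README text")
    replacement = f"{start}\n{new_body.strip()}\n{end}"
    # A function replacement avoids interpreting backslashes in new_body.
    return pattern.sub(lambda _match: replacement, readme_text, count=1)
```

Because the markers are kept in the output, running the update twice with the same body yields the same file, which keeps scheduled runs from producing empty commits.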

🤝 Collaboration and Academic Use

I welcome collaboration from:

  • students and researchers in law, digital forensics, and signal processing,
  • engineering and data-science programs,
  • instructors looking for real-world, reproducible case studies.

For academic or research inquiries:
michal@trojaczek.com


“Theory ends where evidence begins. The rest is documentation.”

Popular repositories

  1. truevektor (Public)

    My personal repository

  2. skills-communicate-using-markdown (Public)

    My clone repository

  3. skills-introduction-to-github (Public)

    Exercise: Introduction to GitHub

  4. whisper.cpp (Public)

    Forked from ggml-org/whisper.cpp

    Port of OpenAI's Whisper model in C/C++

    C++