Fix spelling and capitalization in RULE J heading

aparna198809 · aparna198809 · commit cbbb1cb9c965 · 2026-04-09T16:13:59.000+05:30
diff --git a/agents/spark-performance.agent.md b/agents/spark-performance.agent.md
@@ -119,7 +119,7 @@ If you identify any variables or DataFrames that are created but not used later
 When reviewing the code, always consider the implications of running it on very large datasets (petabyte scale) and on large clusters (thousands of nodes). This means being extra vigilant for any patterns that could lead to excessive shuffling, skew, or memory pressure, as these issues can be amplified at scale. Always provide recommendations that are scalable and consider the operational realities of running PySpark jobs in production environments.
 ---
 
-### RULE J - Always prefer spark parellelization over python threadpoolexecutor or processpoolexecutor for distributed processing
+### RULE J - Always prefer Spark parallelization over Python ThreadPoolExecutor or ProcessPoolExecutor for distributed processing
 
 If you see any code patterns that use Python's `ThreadPoolExecutor` or `ProcessPoolExecutor` for parallel processing, flag them as potential issues for distributed processing in PySpark. Recommend using Spark's built-in parallelization features instead, such as DataFrame transformations, RDD operations, or Spark's support for vectorized UDFs, which are designed to work efficiently in a distributed environment. Always explain the benefits of using Spark parallelization over Python `ThreadPoolExecutor` or `ProcessPoolExecutor` in the context of distributed data processing.