prob_meaning.md: assorted corrections #877

@jstac

A collection of small issues found in lectures/prob_meaning.md.

  • Exercise pm_ex2 part (a): The question asks for the likelihood function for a sample of length $n$, but the solution provides the likelihood for a single flip ($n=1$). Either the question or the solution should be revised so they match.
  • Exercise pm_ex2 solution, before part (c): The sentence "Now pretend that the true value of $\theta = .4$..." appears twice — once as a standalone line before part (c) and again as the opening of part (c) itself. The first occurrence should be removed.
  • Bayesian section, missing setup before exercise pm_ex2: Bayes' Law, likelihood functions, and posterior distributions are not mentioned in the lecture text before the exercise; they only appear inside the solution. These concepts are already introduced in Probability with Matrices (prob_matrix), so a back-reference (e.g., "Recall the posterior distribution derived using Bayes' Law in {doc}`Probability with Matrices <prob_matrix>`") before the exercise would bridge the gap without duplicating material.
  • $n$-step posterior formula is used before it's derived: The solution code in part (c) uses $\text{Beta}(\alpha + k, \beta + n - k)$ via form_single_posterior, and all subsequent plots and coverage interval calculations depend on it. But the formal derivation doesn't appear until much later, in the "It is natural to extend the one-step Bayesian update..." section after the exercise. Moving the derivation earlier means that later section also needs to be revised or removed to avoid duplication.
  • Line length throughout code cells: Comments, docstrings, and code lines frequently exceed 80 characters, making them hard to read on the website. Sentences in comments should also start with a capital letter. Affected areas include the Bayesian class docstrings and various plotting cells (e.g., lines 506, 509, 551, 555, 682).
  • Variable naming: Several names are unclear or misleading:
    • ii is used both as a loop index and as a posterior distribution in list comprehensions (e.g., ii.cdf(...), ii.mean()). Use i for indices and post or posterior for distributions.
    • num / num_list → n_obs / n_obs_list (it's the number of observations, not a generic number)
    • step_num → n_obs (it's not a "step")
    • kk → k (no reason for the doubled name)
    • npt, nn, nI → more descriptive names like n_thetas, n_ns, n_Is
    • K → head_counts (capital K looks like a constant)
    • comp → table
  • PEP 8 naming conventions:
    • Bay_stat should be bay_stat or bayes — instance names should be snake_case, not PascalCase.
    • frequentist class should be Frequentist — class names should be PascalCase. (The Bayesian class already follows this convention.)
  • Plot label formatting: Line 590 uses 'n=%d thousand' % (num/1000), which produces awkward labels like "n=5 thousand". Use f-string formatting with comma separators instead, e.g., f'Posterior with n={n_obs:,}' → "n=5,000".
  • Typos and spelling:
    • "probabilties" → "probabilities" (line 76)
    • "to to help" → "to help" (line 36, doubled word)
    • "probabililty" → "probability" (lines 381, 546, 555, 564, 568, 572 — same misspelling repeated 6 times)
    • "statististian" → "statistician" (line 690)
  • Notation inconsistency: Line 331 refers to "the average of $P_{k,i}$" but the variable was defined as $\rho_{k,i}$ on line 325.
  • Subject-verb agreement (line 602): "posterior means converges" → "posterior mean converges"; "posterior standard deviations converges" → "posterior standard deviation converges".
  • LaTeX equations use * for multiplication (lines 638, 642, 646): Should use \cdot or juxtaposition instead.
  • Spiky posterior density plots: θ_values = np.linspace(0.01, 1, 100) only provides 100 grid points. The part (h) plot is zoomed to [0.3, 0.5], leaving ~20 points to render highly concentrated posteriors. Increase the grid resolution (e.g., 1000 points).
  • Cross-references say "this quantecon lecture" (lines 719–722): These render poorly in PDF where there's no hyperlink context. Replace with actual lecture titles: Non-Conjugate Priors, Posterior Distributions for AR(1) Parameters, and Forecasting an AR(1) Process.
  • Variance formula is wrong (lines 327–329): The variance of $\rho_{k,i}$ is given as $n \cdot \text{Prob}(X=k|\theta) \cdot (1 - \text{Prob}(X=k|\theta))$, but $\rho_{k,i}$ is a Bernoulli indicator, so its variance is $\text{Prob}(X=k|\theta) \cdot (1 - \text{Prob}(X=k|\theta))$ — no factor of $n$.
  • Upper and lower bounds swapped in part (e) (lines 521–522): ppf(0.05) (the 5th percentile) is assigned to upper_bound and ppf(0.95) (the 95th percentile) to lower_bound. Lines 672–673 do it correctly.
  • compare() skips $k=0$ (line 192): range(n) with i+1 starts at $k=1$, but $k=0$ is a valid binomial outcome.
  • Exercise pm_ex1 part 3 is vague: "With the Law of Large numbers in mind, use your code to say something" — say something about what?
  • Text says $\log(I)$ varies from 2 to 7 (line 286) but code has I_log_high = 6 (line 289).
  • Line 321 is imprecise: Says the observed fraction approximates $\theta$, but $f_k^I$ approximates $\text{Prob}(X=k|\theta)$, not $\theta$ itself.
  • Inconsistent bool-to-int conversion: The frequentist class uses * 1 (line 171) while Bayesian uses .astype(int) (line 455). Should be consistent — .astype(int) is clearer.
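
As a sanity check on the variance and $k=0$ items above, here is a minimal stdlib-only sketch. It is not the lecture's code: binom_pmf is a hypothetical helper, and n = 20, θ = 0.4, k = 8 and the sample count are illustrative values chosen for this check. It simulates the indicator $\rho_{k,i}$ and compares its sample variance with both candidate formulas, then shows that the binomial probabilities sum to 1 only when $k=0$ is included.

```python
import random
from math import comb
from statistics import pvariance


def binom_pmf(k, n, theta):
    """Prob(X = k | theta) for X ~ Binomial(n, theta). Hypothetical helper."""
    return comb(n, k) * theta**k * (1 - theta)**(n - k)


# Illustrative values (assumptions, not taken from the lecture)
n, theta, k = 20, 0.4, 8
n_samples = 50_000
p_k = binom_pmf(k, n, theta)

# rho[i] = 1 if the i-th sample of n flips has exactly k heads, else 0
rng = random.Random(0)
rho = [
    int(sum(rng.random() < theta for _ in range(n)) == k)
    for _ in range(n_samples)
]

sample_var = pvariance(rho)
bernoulli_var = p_k * (1 - p_k)        # Formula proposed in this issue
with_factor_n = n * p_k * (1 - p_k)    # Formula currently in the text

print(f"sample variance      : {sample_var:.4f}")
print(f"p(1 - p)             : {bernoulli_var:.4f}")
print(f"n p(1 - p)           : {with_factor_n:.4f}")

# Valid outcomes are k = 0, 1, ..., n. Iterating range(n) with i + 1
# visits k = 1, ..., n and therefore drops the k = 0 term.
total_from_0 = sum(binom_pmf(j, n, theta) for j in range(n + 1))
total_from_1 = sum(binom_pmf(i + 1, n, theta) for i in range(n))

print(f"sum over k = 0..n    : {total_from_0:.6f}")
print(f"sum over k = 1..n    : {total_from_1:.6f}")
```

The sample variance should sit close to $p(1-p)$ and far from $n \cdot p(1-p)$, which is consistent with $\rho_{k,i}$ being a Bernoulli indicator.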

All items addressed in #878.
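
For reference, the plot-label and grid-resolution items can be illustrated with a short stdlib-only sketch. The function names here (old_label, new_label, linspace, points_in_window) are hypothetical and do not appear in the lecture; linspace is a pure-Python stand-in for np.linspace so the snippet runs without NumPy.

```python
def old_label(n_obs):
    # Formatting quoted in the issue for line 590
    return 'n=%d thousand' % (n_obs / 1000)


def new_label(n_obs):
    # Suggested formatting with a comma as thousands separator
    return f'Posterior with n={n_obs:,}'


def linspace(start, stop, num):
    """Pure-Python stand-in for np.linspace(start, stop, num)."""
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


def points_in_window(grid, lo=0.3, hi=0.5):
    """Count grid points that fall inside the zoomed plot range."""
    return sum(lo <= x <= hi for x in grid)


coarse = points_in_window(linspace(0.01, 1, 100))
fine = points_in_window(linspace(0.01, 1, 1000))

print(old_label(5000))   # n=5 thousand
print(new_label(5000))   # Posterior with n=5,000
print(f"grid points in [0.3, 0.5]: {coarse} (100-point grid), "
      f"{fine} (1000-point grid)")
```

Note that %d also truncates non-integer values, so a sample size such as 1,500 would be labelled "n=1 thousand" under the old formatting.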
