
Static analysis, part II #28816

Open
alexduf wants to merge 14 commits into main from adu-static-analysis-II


@alexduf alexduf commented May 27, 2026

The frontend repository for the Guardian's website was created in February 2012: 14 years ago.
It has had an increasing number of features and services added, which has led to this repo becoming pretty big.

Parallel to that, the DCAR project was started with the objective of centralising all the rendering happening at the Guardian.

We're getting closer to all the rendering happening in that new project, but at this stage we find ourselves unable to easily answer a question: what rendering is still happening from within the frontend repo?

More generally, this is the beginning of a larger piece of work: removing unused or undesirable code from this repo, at scale.

Measuring, Measuring, Measuring

The first step to understanding which endpoints are being used is to look at live traffic data.

This work has already been done: it uses the ALB access logs together with the route definitions of this application to understand which endpoints are actively in use.
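As a rough illustration of that matching step (not the code that was actually used), the sketch below counts requests per handler. It assumes the access logs have already been reduced to `(method, path)` pairs and that routes use Play-style `:name` and `*name` path parameters; the helper names and the sample handler names in the usage note are hypothetical, and Python is used only for brevity.

```python
import re
from collections import Counter


def route_to_regex(pattern):
    """Turn a Play-style path pattern into a regex.

    ':name' matches a single path segment, '*name' matches the rest of the path.
    Custom-regex parameters ('$name<...>') are not handled in this sketch.
    """
    parts = []
    for segment in pattern.strip("/").split("/"):
        if segment.startswith(":"):
            parts.append(r"[^/]+")
        elif segment.startswith("*"):
            parts.append(r".+")
        else:
            parts.append(re.escape(segment))
    return re.compile("^/" + "/".join(parts) + "$")


def count_hits(routes, requests):
    """Count requests per handler.

    routes: list of (method, pattern, handler), in routes-file order.
    requests: iterable of (method, path) pairs extracted from the logs.
    The first matching route wins; handlers with no traffic stay at 0.
    """
    compiled = [(method, route_to_regex(pattern), handler)
                for method, pattern, handler in routes]
    hits = Counter({handler: 0 for _, _, handler in routes})
    for method, path in requests:
        for route_method, regex, handler in compiled:
            if method == route_method and regex.match(path):
                hits[handler] += 1
                break
    return hits
```

Calling `count_hits([("GET", "/email/*path", "Example.renderEmail")], [("GET", "/email/world/x")])` would attribute the one request to the hypothetical `Example.renderEmail` handler. Keeping zero-count handlers in the result matters here, because the endpoints with no traffic are the interesting ones.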

Digging deeper

If we know which endpoints are being used, and to what extent, we can now start looking at which endpoints are rendering HTML content by looking at the code.

On a smaller application, this is something that could have been done manually by reading the code with the help of an IDE.

But this project is at a scale where it makes sense to spend a few days writing a tool to do that for us, which explains the existence of this module.

This module leverages the scala-meta library to map all the twirl templates to their corresponding controller(s).

This is done by constructing "call hierarchies" (also known as call graphs), where the root of the tree is a twirl template, and we recursively find all the places where this template is called.

The end result is a tree data-structure that describes all the possible code paths to the twirl template across the application. Here's an edited down version to illustrate it:

views/html/fragments/email/emailArticleBody. in article/target/scala-2.13/twirl/main/views/html/fragments/email/emailArticleBody.template.scala:18:7
  views/html/fragments/email/emailArticleBody. in article/app/pages/ArticleEmailHtmlPage.scala:7:35
    No call to views/html/fragments/email/emailArticleBody. found (entry point)
  views/html/fragments/email/emailArticleBody. in article/app/pages/ArticleEmailHtmlPage.scala:20:8
    pages/ArticleEmailHtmlPage.html(). in article/app/controllers/LiveBlogController.scala:50:70
      controllers/LiveBlogController#renderEmail(). in article/target/scala-2.13/routes/main/router/Routes.scala:152:25
        router/Routes# in article/target/scala-2.13/routes/main/router/Routes.scala:41:37
        ...
      controllers/LiveBlogController#renderEmail(). in article/target/scala-2.13/routes/main/router/Routes.scala:188:25
        router/Routes# in article/target/scala-2.13/routes/main/router/Routes.scala:41:37
          No call to router/Routes# found (entry point)
        ...
    pages/ArticleEmailHtmlPage.html(). in article/app/controllers/ArticleController.scala:103:66
      controllers/ArticleController#render(). in article/app/controllers/ArticleController.scala:43:42
        controllers/ArticleController#mapAndRender(). in article/app/controllers/ArticleController.scala:38:4
          controllers/ArticleController#renderItem(). in article/app/controllers/PublicationController.scala:49:26
            controllers/PublicationController#publishedOn(). in article/target/scala-2.13/routes/main/router/Routes.scala:332:28
              router/Routes# in article/target/scala-2.13/routes/main/router/Routes.scala:41:37
                No call to router/Routes# found (entry point)
              ...
            controllers/PublicationController#publishedOn(). in article/target/scala-2.13/routes/main/router/Routes.scala:461:93
              No call to controllers/PublicationController#publishedOn(). found (entry point)
            controllers/PublicationController#publishedOn(). in article/test/PublicationControllerTest.scala:48:39
              test/PublicationControllerTest# in article/test/package.scala:12:10
                No call to test/PublicationControllerTest# found (entry point)
              ... more tests
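
The shape of the tree above can be reproduced with a small sketch. This is not the module's implementation (which relies on scala-meta); it only assumes that the analysis yields a list of `(caller, callee)` symbol pairs, and the function names are illustrative. Python is used for brevity.

```python
from collections import defaultdict


def build_callers_index(calls):
    """calls: iterable of (caller, callee) pairs -> {callee: [callers]}."""
    index = defaultdict(list)
    for caller, callee in calls:
        index[callee].append(caller)
    return index


def call_hierarchy(symbol, callers, path=()):
    """Return (symbol, [subtrees]) covering every transitive caller of symbol.

    Callers already on the current path are skipped, so the recursion
    terminates even when the code under analysis is recursive.
    """
    path = path + (symbol,)
    children = [call_hierarchy(caller, callers, path)
                for caller in callers.get(symbol, [])
                if caller not in path]
    return (symbol, children)


def entry_points(tree):
    """Leaves of the hierarchy: symbols for which no further caller was found."""
    symbol, children = tree
    if not children:
        return [symbol]
    return [leaf for child in children for leaf in entry_points(child)]


# A few edges taken from the sample output, simplified.
calls = [
    ("pages/ArticleEmailHtmlPage.html().", "views/html/fragments/email/emailArticleBody."),
    ("controllers/LiveBlogController#renderEmail().", "pages/ArticleEmailHtmlPage.html()."),
    ("controllers/ArticleController#render().", "pages/ArticleEmailHtmlPage.html()."),
    ("router/Routes#", "controllers/LiveBlogController#renderEmail()."),
]
tree = call_hierarchy("views/html/fragments/email/emailArticleBody.",
                      build_callers_index(calls))
```

Walking the index from callee to caller is what makes the template the root: each level of the tree answers "who calls this?", and the leaves are the places where that question has no answer.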

But why push this to the frontend repo?

The work that went into writing this static analysis tool for this one specific question is likely to be re-used as we continue cleaning unused features out of the codebase.

For instance, now that we're able to create call hierarchies, we could also identify which methods can be safely removed once we've decided which controller entry points can be dropped.
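
One way this could work, sketched under assumptions rather than taken from the module: treat the call data as a forward graph (`caller -> callees`), compute what is reachable from the entry points being kept, and flag anything reachable only from the dropped ones. The function names and the graph representation are hypothetical.

```python
def reachable(roots, callees):
    """All symbols reachable from roots by following caller -> callee edges."""
    seen = set()
    stack = list(roots)
    while stack:
        symbol = stack.pop()
        if symbol in seen:
            continue
        seen.add(symbol)
        stack.extend(callees.get(symbol, ()))
    return seen


def removal_candidates(kept, dropped, callees):
    """Symbols used by dropped entry points and by no kept entry point.

    The result includes the dropped entry points themselves. It is only as
    complete as the call graph: reflection or calls from outside the analysed
    code would not be seen.
    """
    return reachable(dropped, callees) - reachable(kept, callees)
```

A helper shared between a kept and a dropped controller method stays out of the result, which is the property that makes the output safe to act on.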

Once the cleanup is finished, then this module can be entirely dropped, or spun into its own repo.

End result and next steps

This PR proposes a module that is capable of generating a CSV containing the mapping between endpoints (controller methods) and twirl templates.
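
To make the shape of that output concrete, here is a minimal sketch of flattening such a mapping into CSV. The column names `endpoint` and `template` and the one-row-per-pair layout are assumptions for illustration, not the module's actual format.

```python
import csv
import io


def mapping_to_csv(mapping):
    """mapping: {template: iterable of endpoints} -> CSV text, one row per pair."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["endpoint", "template"])  # assumed header
    for template in sorted(mapping):
        # De-duplicate: the same endpoint can reach a template by several paths.
        for endpoint in sorted(set(mapping[template])):
            writer.writerow([endpoint, template])
    return out.getvalue()
```

One row per (endpoint, template) pair keeps the file easy to join against traffic data later, at the cost of repeating template names.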

The next step is to ingest this CSV, along with the aggregated live traffic data, into a small local database (SQLite) in order to understand:

  • which endpoints aren't used, and therefore which twirl templates aren't rendering anything
  • which endpoints are used, and therefore which templates are rendering HTML in production. Then we can decide what to do with this code (port to DCAR, keep, remove the feature, etc.)
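
The two questions above could be answered with queries along these lines. This is a sketch only: the table names, column names and the idea that traffic is stored as one hit count per endpoint are all assumptions.

```python
import sqlite3


def load(conn, mapping_rows, traffic_rows):
    """mapping_rows: (endpoint, template) pairs; traffic_rows: (endpoint, hits) pairs."""
    conn.execute("CREATE TABLE endpoint_template (endpoint TEXT, template TEXT)")
    conn.execute("CREATE TABLE traffic (endpoint TEXT PRIMARY KEY, hits INTEGER)")
    conn.executemany("INSERT INTO endpoint_template VALUES (?, ?)", mapping_rows)
    conn.executemany("INSERT INTO traffic VALUES (?, ?)", traffic_rows)


def unused_templates(conn):
    """Templates for which no mapped endpoint has any recorded traffic."""
    rows = conn.execute("""
        SELECT m.template
        FROM endpoint_template m
        LEFT JOIN traffic t ON t.endpoint = m.endpoint
        GROUP BY m.template
        HAVING COALESCE(SUM(t.hits), 0) = 0
        ORDER BY m.template
    """)
    return [row[0] for row in rows]


def live_templates(conn):
    """(template, total hits) for templates rendered by endpoints with traffic."""
    rows = conn.execute("""
        SELECT m.template, SUM(t.hits) AS hits
        FROM endpoint_template m
        JOIN traffic t ON t.endpoint = m.endpoint
        WHERE t.hits > 0
        GROUP BY m.template
        ORDER BY hits DESC
    """)
    return list(rows)
```

With a connection from `sqlite3.connect(":memory:")`, `load` followed by the two query functions gives the "candidates for removal" and "still rendering in production" lists. The `LEFT JOIN` matters for the first query, since an endpoint that never appears in the traffic table should count as unused rather than disappear from the result.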

About the review of this code

This code does not come with the usual unit tests and general rigour expected of production code, simply because it is NOT production code. It is only meant to be run locally by engineers.

@alexduf alexduf added the maintenance Departmental tracking: maintenance work, not a fix or a feature label May 27, 2026

@alexduf alexduf marked this pull request as ready for review May 27, 2026 13:04
@alexduf alexduf requested a review from a team as a code owner May 27, 2026 13:04
@alexduf alexduf linked an issue May 27, 2026 that may be closed by this pull request


Development

Successfully merging this pull request may close these issues.

Static analysis: map endpoints to twirl templates

1 participant