Static analysis, part II by alexduf · Pull Request #28816 · guardian/frontend

alexduf · 2026-05-27T12:21:08Z

The frontend repository for the Guardian's website was created in February 2012: 14 years ago.
It has accumulated a growing number of features and services since then, and the repo has become pretty big.

In parallel, the DCAR project was started with the objective of centralising all the rendering happening at the Guardian.

We're getting closer to all the rendering happening in that new project, but at this stage we find ourselves unable to easily answer a question: what rendering is still happening from within the frontend repo?

More generally, this is the beginning of a larger piece of work: removing unused or undesirable code from this repo, at scale.

Measuring, Measuring, Measuring

The first step to understanding which endpoints are being used is to look at live traffic data.

This work has already been done: it uses the ALB access logs together with this application's route definitions to understand which endpoints are actively in use.
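The PR does not describe how that earlier work matches log entries to routes, so the following is only a minimal sketch of the general idea, assuming the request method and path have already been extracted from the access logs. The route paths in the example are hypothetical, and only the `:name` and `*name` dynamic parts of Play route syntax are handled; custom-regex parts (`$name<...>`) are left out.

```python
import re
from collections import Counter


def route_to_regex(pattern):
    """Convert a Play-style route path into a compiled regex.

    `:name` matches exactly one path segment, `*name` matches the rest of
    the path. Custom-regex parts (`$name<...>`) are not handled here.
    """
    parts = []
    for segment in pattern.strip("/").split("/"):
        if segment.startswith(":"):
            parts.append("[^/]+")
        elif segment.startswith("*"):
            parts.append(".+")
        else:
            parts.append(re.escape(segment))
    return re.compile("/" + "/".join(parts))


def count_hits(routes, requests):
    """Count requests per handler.

    routes: list of (method, path_pattern, handler), in routes-file order.
    requests: iterable of (method, path) pairs taken from the logs.
    The first matching route wins; requests matching no route are ignored.
    """
    compiled = [(m, route_to_regex(p), h) for m, p, h in routes]
    hits = Counter({handler: 0 for _, _, handler in routes})
    for method, path in requests:
        for route_method, regex, handler in compiled:
            if method == route_method and regex.fullmatch(path):
                hits[handler] += 1
                break
    return hits


# Hypothetical route paths; the handler names are the ones quoted in this PR.
ROUTES = [
    ("GET", "/email/:id", "controllers.LiveBlogController.renderEmail"),
    ("GET", "/published/*path", "controllers.PublicationController.publishedOn"),
]
REQUESTS = [("GET", "/email/abc"), ("GET", "/email/def"), ("GET", "/other")]

hits = count_hits(ROUTES, REQUESTS)
for handler, count in hits.items():
    print(handler, count)
```

Handlers that end up with a count of zero are the candidates for "not actively in use".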

Digging deeper

Now that we know which endpoints are being used, and to what extent, we can start working out which of them render HTML content by looking at the code.

On a smaller application, this could have been done manually by reading the code with the help of an IDE.

But this project is at a scale where it makes sense to spend a few days writing a tool to do that for us, which explains the existence of this module.

This module leverages the scala-meta library to map all the twirl templates to their corresponding controller(s).

This is done by constructing "call hierarchies" (also known as call graphs), where the root of each tree is a twirl template, and we recursively find all the places where this template is called.

The end result is a tree data-structure that describes all the possible code paths to the twirl template across the application. Here's an edited down version to illustrate it:

views/html/fragments/email/emailArticleBody. in article/target/scala-2.13/twirl/main/views/html/fragments/email/emailArticleBody.template.scala:18:7
  views/html/fragments/email/emailArticleBody. in article/app/pages/ArticleEmailHtmlPage.scala:7:35
    No call to views/html/fragments/email/emailArticleBody. found (entry point)
  views/html/fragments/email/emailArticleBody. in article/app/pages/ArticleEmailHtmlPage.scala:20:8
    pages/ArticleEmailHtmlPage.html(). in article/app/controllers/LiveBlogController.scala:50:70
      controllers/LiveBlogController#renderEmail(). in article/target/scala-2.13/routes/main/router/Routes.scala:152:25
        router/Routes# in article/target/scala-2.13/routes/main/router/Routes.scala:41:37
        ...
      controllers/LiveBlogController#renderEmail(). in article/target/scala-2.13/routes/main/router/Routes.scala:188:25
        router/Routes# in article/target/scala-2.13/routes/main/router/Routes.scala:41:37
          No call to router/Routes# found (entry point)
        ...
    pages/ArticleEmailHtmlPage.html(). in article/app/controllers/ArticleController.scala:103:66
      controllers/ArticleController#render(). in article/app/controllers/ArticleController.scala:43:42
        controllers/ArticleController#mapAndRender(). in article/app/controllers/ArticleController.scala:38:4
          controllers/ArticleController#renderItem(). in article/app/controllers/PublicationController.scala:49:26
            controllers/PublicationController#publishedOn(). in article/target/scala-2.13/routes/main/router/Routes.scala:332:28
              router/Routes# in article/target/scala-2.13/routes/main/router/Routes.scala:41:37
                No call to router/Routes# found (entry point)
              ...
            controllers/PublicationController#publishedOn(). in article/target/scala-2.13/routes/main/router/Routes.scala:461:93
              No call to controllers/PublicationController#publishedOn(). found (entry point)
            controllers/PublicationController#publishedOn(). in article/test/PublicationControllerTest.scala:48:39
              test/PublicationControllerTest# in article/test/package.scala:12:10
                No call to test/PublicationControllerTest# found (entry point)
              ... more tests
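The idea behind a tree like the one above can be sketched in a few lines. This is not the module's implementation (which is written in Scala and reads SemanticDB data); it is a simplified illustration that assumes we already have a map from each symbol to the symbols that call it, and it drops source locations. The `Node` class, `build_hierarchy` and `render` are hypothetical names; the symbols are taken from the output above.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    symbol: str
    callers: list = field(default_factory=list)
    entry_point: bool = False  # nothing calls this symbol
    recursive: bool = False    # symbol is already on the current path


def build_hierarchy(symbol, callers_of, path=()):
    """Build the tree of callers of `symbol`, walking up the call graph.

    `path` holds the symbols between the root and this node; it is the
    infinite recursion protection for mutually recursive code.
    """
    node = Node(symbol)
    if symbol in path:
        node.recursive = True
        return node
    callers = callers_of.get(symbol, [])
    if not callers:
        node.entry_point = True
    for caller in callers:
        node.callers.append(build_hierarchy(caller, callers_of, path + (symbol,)))
    return node


def render(node, depth=0):
    """Return the tree as indented lines, one symbol per line."""
    indent = "  " * depth
    lines = [indent + node.symbol]
    if node.entry_point:
        lines.append(f"{indent}  No call to {node.symbol} found (entry point)")
    if node.recursive:
        lines.append(f"{indent}  (recursive call, not expanded)")
    for caller in node.callers:
        lines.extend(render(caller, depth + 1))
    return lines


TEMPLATE = "views/html/fragments/email/emailArticleBody."
CALLERS_OF = {
    TEMPLATE: ["pages/ArticleEmailHtmlPage.html()."],
    "pages/ArticleEmailHtmlPage.html().": [
        "controllers/LiveBlogController#renderEmail().",
        "controllers/ArticleController#render().",
    ],
    "controllers/LiveBlogController#renderEmail().": ["router/Routes#"],
    "controllers/ArticleController#render().": ["controllers/ArticleController#mapAndRender()."],
    "controllers/ArticleController#mapAndRender().": ["controllers/ArticleController#renderItem()."],
    "controllers/ArticleController#renderItem().": ["controllers/PublicationController#publishedOn()."],
    "controllers/PublicationController#publishedOn().": ["router/Routes#", "test/PublicationControllerTest#"],
}

tree = build_hierarchy(TEMPLATE, CALLERS_OF)
print("\n".join(render(tree)))
```

Tracking the current path rather than a global "visited" set means a symbol reachable through two different routes is expanded under both, which matches the repeated `router/Routes#` subtrees in the output above.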

But why push this to the frontend repo?

The work that went into writing this static analysis tool for this one specific question is likely to be reused as we continue clearing unused features out of the codebase.

For instance, now that we're able to create call hierarchies, we could also identify which methods can be safely removed once we've decided which controller entry points can be dropped.

Once the cleanup is finished, this module can be dropped entirely or spun out into its own repo.

End result and next steps

This PR proposes a module capable of generating a CSV that contains the mapping between endpoints (controller methods) and twirl templates.
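To illustrate how a call hierarchy could be flattened into such a mapping, here is a small sketch. It assumes the hierarchy is available as nested dictionaries (symbol to its callers) and that an "endpoint" is any `controllers/` symbol referenced from `router/Routes`; both the representation and that rule are assumptions made for this example, as are the `endpoint` and `template` column names. The module's actual CSV layout is not described in this PR.

```python
import csv
import io


def endpoint_rows(template, callers_of_template):
    """Return sorted (endpoint, template) pairs found in a caller tree.

    The tree is a nested dict: each key is a symbol, each value is the
    dict of symbols that call it. An empty dict marks an entry point.
    """
    rows = set()

    def walk(callers):
        for symbol, its_callers in callers.items():
            is_controller = symbol.startswith("controllers/")
            routed = any(c.startswith("router/Routes") for c in its_callers)
            if is_controller and routed:
                rows.add((symbol, template))
            walk(its_callers)

    walk(callers_of_template)
    return sorted(rows)


def to_csv(rows):
    """Serialise the rows to CSV text with a header line."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["endpoint", "template"])
    writer.writerows(rows)
    return out.getvalue()


TEMPLATE = "views/html/fragments/email/emailArticleBody."
TREE = {
    "pages/ArticleEmailHtmlPage.html().": {
        "controllers/LiveBlogController#renderEmail().": {"router/Routes#": {}},
        "controllers/ArticleController#render().": {
            "controllers/ArticleController#mapAndRender().": {
                "controllers/ArticleController#renderItem().": {
                    "controllers/PublicationController#publishedOn().": {
                        "router/Routes#": {},
                        "test/PublicationControllerTest#": {},
                    }
                }
            }
        },
    }
}

rows = endpoint_rows(TEMPLATE, TREE)
print(to_csv(rows))
```

Intermediate methods such as `ArticleController#render()` are skipped because nothing in the router refers to them directly; only the routed controller methods end up in the mapping.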

The next step is to ingest this CSV, along with the aggregated live traffic data, into a small local database (SQLite) and to understand:

- which endpoints aren't used, and therefore which twirl templates aren't rendering anything
- which endpoints are used, and therefore which templates are rendering HTML in production. Then we can decide what to do with this code (port it to DCAR, keep it, remove the feature, etc.)
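As a rough sketch of what that analysis could look like, the example below loads both data sets into an in-memory SQLite database and sums the traffic per template. The table and column names, the request counts and the `HypotheticalController` / `hypotheticalUnused` entries are all made up for illustration; in practice the rows would be read from the generated CSV and from the aggregated traffic data.

```python
import sqlite3


def load(mappings, traffic):
    """Create an in-memory database holding both data sets."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE mapping (endpoint TEXT, template TEXT)")
    db.execute("CREATE TABLE traffic (endpoint TEXT PRIMARY KEY, requests INTEGER)")
    db.executemany("INSERT INTO mapping VALUES (?, ?)", mappings)
    db.executemany("INSERT INTO traffic VALUES (?, ?)", traffic)
    return db


# Total requests reaching each template through any of its endpoints.
# Endpoints with no traffic row count as zero thanks to the LEFT JOIN.
TEMPLATE_TRAFFIC = """
SELECT m.template, COALESCE(SUM(t.requests), 0) AS requests
FROM mapping m
LEFT JOIN traffic t ON t.endpoint = m.endpoint
GROUP BY m.template
ORDER BY m.template
"""

MAPPINGS = [
    ("controllers/LiveBlogController#renderEmail().", "views/html/fragments/email/emailArticleBody."),
    ("controllers/PublicationController#publishedOn().", "views/html/fragments/email/emailArticleBody."),
    ("controllers/HypotheticalController#old().", "views/html/hypotheticalUnused."),
]
TRAFFIC = [
    ("controllers/LiveBlogController#renderEmail().", 1200),  # made-up count
    ("controllers/PublicationController#publishedOn().", 0),
]

db = load(MAPPINGS, TRAFFIC)
per_template = db.execute(TEMPLATE_TRAFFIC).fetchall()
unused = [template for template, requests in per_template if requests == 0]
used = [template for template, requests in per_template if requests > 0]
print("unused:", unused)
print("used:", used)
```

Aggregating per template matters because a template only stops rendering once every endpoint that reaches it is unused, not just one of them.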

About the review of this code

This code does not come with the usual unit tests and general rigour expected of production code, simply because it is NOT production code. It is only meant to be run locally by engineers.

github-actions · 2026-05-27T12:28:04Z

Deploy build 7026 of `dotcom:frontend-all` to CODE

From guardian/actions-riff-raff.

alexduf added 14 commits May 22, 2026 14:44

Implement call hierarchy discovery

4568b5e

nicer print

4cf2fae

Wip load all sources

61a2cb8

Wip load all semanticDB

c6efb76

Wip match files by paths

e574e18

Initialise the search from the SemanticDB

15afbb5

Flip the logic upside down

acb8dcd

This simplifies a lot of things

Refactor into a hierarchy builder

3838584

Various improvements:

24c0a91

- naming
- infinite recursion protection
- better model
- faster semantic db access

Various improvements:

966a0d7

- types
- naming

Improve resolution

9859e98

Output controller -> twirl mapping

eba8a62

Do the whole of frontend: 1275 mappings

ec8cb8c

Output a CSV

6da670c

alexduf added the maintenance label (Departmental tracking: maintenance work, not a fix or a feature) May 27, 2026

alexduf marked this pull request as ready for review May 27, 2026 13:04

alexduf requested a review from a team as a code owner May 27, 2026 13:04

alexduf linked an issue May 27, 2026 that may be closed by this pull request

Static analysis: map endpoints to twirl templates #28803

Open

alexduf wants to merge 14 commits into main from adu-static-analysis-II
