Can we do topic modeling? #5

@dgarijo

Use case:
Given a piece of software, which other software is it most closely related to?

How would this be done?
1- Compute topics for the corpus from the software descriptions (e.g., with Latent Dirichlet Allocation).
2- For each topic, you get the probability that a document belongs to it; grouping documents by their most probable topic creates clusters of software.
3- Given a new query (in this case a series of keywords), calculate which cluster it is most similar to.
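
The three steps above could be sketched roughly as follows. This is only an illustration, not an agreed design: it uses a small collapsed Gibbs sampler for LDA written with the standard library, and the corpus, the software names, the function names (`fit_lda`, `clusters_by_topic`, `query_topics`) and the hyperparameters are all made up for the example. The query placement in step 3 is a simplification (it averages per-keyword topic probabilities under a uniform topic prior) rather than full LDA inference.

```python
import random
from collections import Counter, defaultdict


def fit_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Step 1: fit LDA by collapsed Gibbs sampling; docs are lists of tokens."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]         # tokens of doc d in topic k
    topic_word = [Counter() for _ in range(n_topics)]  # count of word w in topic k
    topic_total = [0] * n_topics                       # tokens assigned to topic k
    assign = []                                        # topic of every token

    def update(d, w, k, delta):
        doc_topic[d][k] += delta
        topic_word[k][w] += delta
        topic_total[k] += delta

    for d, doc in enumerate(docs):                     # random initial topics
        assign.append([rng.randrange(n_topics) for _ in doc])
        for w, k in zip(doc, assign[d]):
            update(d, w, k, +1)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                update(d, w, assign[d][i], -1)         # remove token, then resample
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(n_topics)
                ]
                assign[d][i] = rng.choices(range(n_topics), weights)[0]
                update(d, w, assign[d][i], +1)

    # Step 2 input: probability of each document belonging to each topic.
    theta = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in counts]
        for counts, doc in zip(doc_topic, docs)
    ]

    def phi(k, w):
        """Smoothed probability of word w under topic k."""
        return (topic_word[k][w] + beta) / (topic_total[k] + vocab_size * beta)

    return theta, phi


def clusters_by_topic(names, theta):
    """Step 2: put each document in the cluster of its most probable topic."""
    clusters = defaultdict(list)
    for name, row in zip(names, theta):
        clusters[row.index(max(row))].append(name)
    return clusters


def query_topics(keywords, phi, n_topics):
    """Step 3 (approximate): average p(topic | keyword) over the query."""
    if not keywords:
        return [1.0 / n_topics] * n_topics
    dist = [0.0] * n_topics
    for w in keywords:
        scores = [phi(k, w) for k in range(n_topics)]
        total = sum(scores)
        for k in range(n_topics):
            dist[k] += scores[k] / total / len(keywords)
    return dist


# Hypothetical software descriptions, invented for this example.
corpus = {
    "toolA": "parse rdf ontology graph metadata",
    "toolB": "ontology metadata rdf vocabulary",
    "toolC": "train neural network model gpu",
    "toolD": "gpu neural model inference network",
}
names = list(corpus)
docs = [text.split() for text in corpus.values()]

theta, phi = fit_lda(docs, n_topics=2)
clusters = clusters_by_topic(names, theta)
query = query_topics(["ontology", "metadata"], phi, n_topics=2)
best_topic = query.index(max(query))
print("Most similar cluster:", clusters[best_topic])
```

In practice an existing LDA implementation would probably be used instead of a hand-written sampler, and the number of topics would need tuning on the real corpus.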

We could also define a metric based on graph similarity (to be explored).

Labels: enhancement
