A general question I have concerns the large amount of information in text. How do I weight features so as to distinguish the more informative ones from the less informative ones?
This would be easier if only single words were involved; for example, we discussed tf-idf in previous weeks. But how should features be weighted when there are relations between words, and when the data are stored in an RDBMS?
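For reference, the single-word case I have in mind could be sketched roughly as follows. This is only a minimal illustration of plain tf-idf, assuming the documents are already tokenized into word lists; the function name `tf_idf` and the toy documents are made up for the example, and it does not address relations between words or data stored in an RDBMS.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Hypothetical helper for illustration: tf is the relative frequency of
    a term within a document, idf is log(N / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return weights


# Toy corpus (assumed data, only for the example).
docs = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "the cat chased the dog".split(),
]
weights = tf_idf(docs)
```

In this toy corpus "the" occurs in every document and so gets weight 0, while "mat" occurs in only one document and is weighted above "cat", which occurs in two.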