Fix for NUTCH-2455: more efficient usage of hostdb in generate #254

Closed
okedoki wants to merge 9 commits into apache:master from okedoki:NUTCH-2455

okedoki (Contributor) commented on Dec 8, 2017

Three questions/modifications are left open:

  1. In several places in the Nutch code we use url.getHost(); in others we use url.getHost().toLowerCase(). Why?
  2. public static class ScoreHostKeyComparator extends WritableComparator should implement a raw comparator. If you know how to do it, you are welcome to.
  3. The whole Generator file is too big; it should be split into several files. Again, if you know a good way to do this, you are welcome to.
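
For point 2, a minimal sketch of what a raw (byte-level) comparison could look like, written in plain Java so it runs without Hadoop on the classpath. The key layout (a 4-byte big-endian float score followed by the UTF-8 host bytes), the ordering (score descending, then host ascending), and the names `RawScoreHostComparatorSketch` and `serialize` are assumptions for illustration only, not the layout used in this PR. The `compare` method mirrors the signature of Hadoop's `RawComparator.compare(byte[], int, int, byte[], int, int)`.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: compares serialized keys without deserializing them.
class RawScoreHostComparatorSketch {

    // Assumed layout: 4-byte big-endian float score, then raw UTF-8 host bytes.
    static byte[] serialize(float score, String host) {
        byte[] hostBytes = host.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(4 + hostBytes.length)
                .putFloat(score)
                .put(hostBytes)
                .array();
    }

    // Same parameter shape as Hadoop's RawComparator.compare:
    // (buffer, start offset, length) for each of the two keys.
    static int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        float score1 = ByteBuffer.wrap(b1, s1, 4).getFloat();
        float score2 = ByteBuffer.wrap(b2, s2, 4).getFloat();

        // Assumed ordering: higher scores sort first.
        int byScore = Float.compare(score2, score1);
        if (byScore != 0) {
            return byScore;
        }

        // Tie-break on the host bytes, compared as unsigned values.
        int n1 = l1 - 4;
        int n2 = l2 - 4;
        int common = Math.min(n1, n2);
        for (int i = 0; i < common; i++) {
            int x = b1[s1 + 4 + i] & 0xff;
            int y = b2[s2 + 4 + i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return n1 - n2;
    }

    public static void main(String[] args) {
        byte[] high = serialize(2.0f, "b.example.org");
        byte[] low = serialize(1.0f, "a.example.org");
        // Negative result means the first key sorts before the second.
        System.out.println(compare(high, 0, high.length, low, 0, low.length));
    }
}
```

In an actual Hadoop comparator, the static helpers `WritableComparator.readFloat` and `WritableComparator.compareBytes` could replace the `ByteBuffer` read and the manual byte loop; the offsets would have to match however the real key class writes its fields.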

2 participants