Open
…xt to keatsTraining.txt
…d and testTokenizeWithInternalPeriod tests
auberonedu reviewed Feb 18, 2025
Enter the filename: hatsuneMikuTraining.txt
Enter the number of words to generate: 150
c'mon, i nothing but up, up. just take the stars they will be alright there's a rhythm from new york to help us all here and let me hear you and i know we knew how it'd go all along. yeah! we're miles away but up, up. so take the air a doubt. yeah! intergalactic bound whoa oh, whoa oh oh hey! hey! intergalactic bound whoa oh hey! hey! hey! intergalactic bound a wave of the clouds nothing but up, up. yeah! we're never be alright there's a breakthrough that can we knew how it'd go all along. yeah! we're miles away but our side to japan know you'll never coming down. uh huh. yeah! we're never coming down jump, jump without a rhythm from now on. yeah! intergalactic bound a doubt nothing but up, up. nothing but up, up
// Check if string contains a period
if (word.contains(".")) {
    // Check if period doesn't occur at the end of String
    if (word.indexOf(".") != word.length() - 1) {
Nice! You could also use !word.endsWith(".") here.
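A minimal sketch comparing the check in the diff with the suggested endsWith variant. The class and method names are hypothetical and not part of the pull request. The two checks agree for words with a single period, but they differ for a word such as "e.g." where an internal period comes before a trailing one, so which behavior is wanted depends on how the tokenizer should treat such words.

```java
class PeriodCheckSketch {
    // The check as written in the diff: true when the FIRST period
    // in the word is not the word's last character.
    static boolean firstPeriodNotLast(String word) {
        return word.contains(".") && word.indexOf(".") != word.length() - 1;
    }

    // The suggested variant: true when the word contains a period
    // but does not end with one.
    static boolean endsWithoutPeriod(String word) {
        return word.contains(".") && !word.endsWith(".");
    }

    public static void main(String[] args) {
        // Sample words chosen for illustration only.
        for (String word : new String[] {"hello.", "e.g", "e.g.", "hello"}) {
            System.out.println(word + " -> " + firstPeriodNotLast(word)
                    + " / " + endsWithoutPeriod(word));
        }
    }
}
```

For "hello.", "e.g", and "hello" both methods return the same value; for "e.g." the first returns true and the second returns false.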
Comment on lines +19 to +28
@Test
void testTokenizeWithManySpaces() {
    LowercaseSentenceTokenizer tokenizer = new LowercaseSentenceTokenizer();

    Scanner scanner = new Scanner("hello hi hi hi hello hello");

    List<String> tokens = tokenizer.tokenize(scanner);

    assertEquals(List.of("hello", "hi", "hi", "hi", "hello", "hello"), tokens);
}
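For context on why a test like this can pass, here is a minimal sketch of whitespace-insensitive tokenizing with java.util.Scanner. It is not the LowercaseSentenceTokenizer from this pull request; the class and the tokenizeLowercase helper are hypothetical. It relies only on the fact that Scanner's default delimiter matches runs of whitespace, so repeated spaces, tabs, and newlines between words produce no empty tokens.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class WhitespaceTokenizeSketch {
    // Hypothetical helper: reads whitespace-separated tokens and lowercases them.
    static List<String> tokenizeLowercase(Scanner scanner) {
        List<String> tokens = new ArrayList<>();
        // hasNext()/next() skip any run of whitespace between tokens.
        while (scanner.hasNext()) {
            tokens.add(scanner.next().toLowerCase());
        }
        return tokens;
    }

    public static void main(String[] args) {
        Scanner scanner = new Scanner("hello     hi  \t hi\nHELLO");
        System.out.println(tokenizeLowercase(scanner));
    }
}
```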
Comment on lines +58 to +72
for (int i = 0; i < trainingWords.size(); i++) {
    // Check if neighborMap doesn't contain the current word in trainingWords
    if (!neighborMap.containsKey(trainingWords.get(i))) {
        // If true, create an empty list
        List<String> followingWords = new ArrayList<>();
        // Nested loop over the rest of the words in trainingWords list
        for (int j = i; j < trainingWords.size(); j++) {
            // Check if the words at indices i and j in the trainingWords list are the same
            // and that j + 1 is less than the size of the trainingWords list
            if (trainingWords.get(i).equals(trainingWords.get(j)) && (j + 1 < trainingWords.size())) {
                // If true, store the index of the word that comes next and add that word to the
                // followingWords list
                int add = j + 1;
                followingWords.add(trainingWords.get(add));
            }
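The nested loop above rescans the rest of the list for every new word, which is quadratic in the number of training words. One possible alternative, sketched here under assumptions, builds the same kind of map in a single pass with Map.computeIfAbsent. The class and the buildNeighborMap helper are hypothetical, and HashMap is an assumed map type. One behavioral difference to be aware of: a word that appears only as the final training word gets no entry in this sketch, whereas the excerpt creates a list for it before scanning (the rest of that code is not shown here).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class NeighborMapSketch {
    // Hypothetical helper: maps each word to the words that directly follow it,
    // in order of occurrence and with duplicates kept.
    static Map<String, List<String>> buildNeighborMap(List<String> trainingWords) {
        Map<String, List<String>> neighborMap = new HashMap<>();
        // Stop one short of the end: the last word has no follower.
        for (int i = 0; i + 1 < trainingWords.size(); i++) {
            neighborMap
                    .computeIfAbsent(trainingWords.get(i), key -> new ArrayList<>())
                    .add(trainingWords.get(i + 1));
        }
        return neighborMap;
    }

    public static void main(String[] args) {
        // Sample input chosen for illustration only.
        System.out.println(buildNeighborMap(List.of("up", "up", "and", "up", "away")));
    }
}
```

Keeping duplicates in each list preserves the frequency weighting that random selection from the list depends on.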
Comment on lines +133 to +140
String currentWord = context.get(context.size() - 1);

// Generate a random int between 0 and the size of the list associated with
// currentWord in neighborMap
int randomNum = (int) (Math.random() * neighborMap.get(currentWord).size());

// Return the randomly chosen word
return neighborMap.get(currentWord).get(randomNum);
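As written, this selection throws a NullPointerException if currentWord has no entry in neighborMap, and an IndexOutOfBoundsException if its list is empty. A hedged sketch of a guarded variant follows. The class name, the Optional return type, and the injected Random parameter are assumptions made for illustration, not the method signature used in this pull request. Passing in a Random also allows a seeded instance in tests.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Random;

class NextWordSketch {
    // Hypothetical helper: picks a random follower of the last word in context.
    // Returns Optional.empty() when that word has no recorded followers
    // (an assumed policy; the original code does not handle this case).
    static Optional<String> predictNextWord(List<String> context,
                                            Map<String, List<String>> neighborMap,
                                            Random random) {
        String currentWord = context.get(context.size() - 1);
        List<String> followers = neighborMap.get(currentWord);
        if (followers == null || followers.isEmpty()) {
            return Optional.empty();
        }
        // nextInt(bound) returns a value in [0, bound), so the index is always valid.
        return Optional.of(followers.get(random.nextInt(followers.size())));
    }

    public static void main(String[] args) {
        // Sample data chosen for illustration only.
        Map<String, List<String>> neighborMap = Map.of("up", List.of("up", "and", "away"));
        System.out.println(predictNextWord(List.of("nothing", "but", "up"), neighborMap, new Random()));
    }
}
```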
No description provided.