Add support for per-chapter remarks#408
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #408 +/- ##
==========================================
+ Coverage 73.06% 73.10% +0.04%
==========================================
Files 439 439
Lines 36700 36763 +63
Branches 5042 5056 +14
==========================================
+ Hits 26814 26877 +63
Misses 8775 8775
Partials 1111 1111 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Enkidu93
left a comment
There was a problem hiding this comment.
How does this work relate to this PR Matthew is working on: sillsdev/machine.py#285?
@Enkidu93 reviewed 3 files and all commit messages, and made 1 comment.
Reviewable status:complete! all files reviewed, all discussions resolved (waiting on ddaspit).
ddaspit
left a comment
There was a problem hiding this comment.
@ddaspit reviewed 3 files and all commit messages, and made 1 comment.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on pmachapman).
src/SIL.Machine/Corpora/UpdateUsfmParserHandler.cs line 468 at r1 (raw file):
{ // Add the remarks just after the specified chapter, // skipping any alternate and published chapter numbers
Unless I'm missing something, I don't think this will skip \ca and \cp markers.
c9711e8 to
5acb2fd
Compare
pmachapman
left a comment
There was a problem hiding this comment.
@Enkidu93 I was unaware of Matthew's PR (sorry that is on me, I forgot to look). It is a little different - this PR keeps support for book level remarks, and allows different remarks per chapter. I've commented on his PR, and am happy to change this PR to match his if my implementation is incorrect (as I couldn't find concrete specs I got to this PR via an hour of iteration and testing).
@pmachapman made 2 comments.
Reviewable status: 2 of 3 files reviewed, 1 unresolved discussion (waiting on ddaspit and Enkidu93).
src/SIL.Machine/Corpora/UpdateUsfmParserHandler.cs line 468 at r1 (raw file):
Previously, ddaspit (Damien Daspit) wrote…
Unless I'm missing something, I don't think this will skip
\caand\cpmarkers.
Done. That was a hangover from a previous iteration of this change.
5acb2fd to
4ea8c05
Compare
4ea8c05 to
e234f23
Compare
Enkidu93
left a comment
There was a problem hiding this comment.
No worries. I think we'll probably want some of the changes from both PRs. We'll want your per-chapter remarks and his support for partial USFM. It's probably best to pick machine or machine.py and make all the changes there and then port. I'll leave it to you and Matthew to figure out what makes the most sense in regard to where and who.
@Enkidu93 reviewed 1 file and all commit messages, and made 1 comment.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on ddaspit).
pmachapman
left a comment
There was a problem hiding this comment.
We'll want your per-chapter remarks and his support for partial USFM
I have ported the partial USFM support from Matthew's PR.
@pmachapman made 1 comment.
Reviewable status: 1 of 4 files reviewed, 1 unresolved discussion (waiting on ddaspit and Enkidu93).
ddaspit
left a comment
There was a problem hiding this comment.
@ddaspit reviewed 4 files and all commit messages, made 1 comment, and resolved 1 discussion.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on pmachapman).
src/SIL.Machine/Corpora/UsfmTokenizer.cs line 43 at r3 (raw file):
string usfm, bool preserveWhitespace = false, IReadOnlyList<int> filterTokensByChapter = null
I don't want to mess with UsfmTokenizer. I would prefer to perform the filtering in ParatextProjectTextUpdaterBase.
pmachapman
left a comment
There was a problem hiding this comment.
@pmachapman made 1 comment.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on ddaspit).
src/SIL.Machine/Corpora/UsfmTokenizer.cs line 43 at r3 (raw file):
Previously, ddaspit (Damien Daspit) wrote…
I don't want to mess with
UsfmTokenizer. I would prefer to perform the filtering inParatextProjectTextUpdaterBase.
I have moved FilterTokensByChapter to ParatextProjectTextUpdaterBase, and made it internal so the unit tests that require it can used it.
Enkidu93
left a comment
There was a problem hiding this comment.
@Enkidu93 reviewed 4 files and all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on ddaspit).
Enkidu93
left a comment
There was a problem hiding this comment.
@Enkidu93 made 1 comment.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on ddaspit).
src/SIL.Machine/Corpora/UsfmTokenizer.cs line 43 at r3 (raw file):
Previously, pmachapman (Peter Chapman) wrote…
I have moved
FilterTokensByChaptertoParatextProjectTextUpdaterBase, and made itinternalso the unit tests that require it can used it.
I put a comment in Matthew's PR regarding this. If he refactors there to use a Memory... updater implementation, you'll want to update accordingly.
ddaspit
left a comment
There was a problem hiding this comment.
@ddaspit reviewed 4 files and all commit messages, made 1 comment, and resolved 1 discussion.
Reviewable status:complete! all files reviewed, all discussions resolved (waiting on pmachapman).
Part of sillsdev/serval#905
This change is