Fix space issue with ZH ITN #244
Signed-off-by: Anand Joseph <[email protected]>
Update FST paths Signed-off-by: anand-nv <[email protected]>
This PR is stale because it has been open for 14 days with no activity. Remove the stale label, comment, or update the PR, or it will be closed in 7 days.
tbartley94 left a comment
Clarifying questions, but LGTM as long as tests pass
    GraphFst,
    delete_extra_space,
    delete_space,
    delete_zero_or_one_space,
Do we have a single delete-space FST that we can just wrap in a Kleene question (zero or one)?
    def __init__(self):
        super().__init__(name="word", kind="classify")
        word = pynutil.insert('name: "') + pynini.closure(NEMO_NOT_SPACE, 1) + pynutil.insert('"')
Remind me of this syntax: is closure(..., 1) equivalent to zero-or-one, or does it match exactly one?
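For reference on the question above: in pynini, the second argument to `closure` is a lower bound, so `closure(x, 1)` means one or more repetitions of `x`, while `closure(x, 0, 1)` is the zero-or-one form. The sketch below is only an analogy built on Python's standard `re` module, not pynini itself; `NOT_SPACE` is a hypothetical stand-in for `NEMO_NOT_SPACE`, and the `accepts` helper is introduced purely for illustration.

```python
import re

# Hypothetical stand-in for NEMO_NOT_SPACE: any single non-whitespace character.
NOT_SPACE = r"\S"

# Regex analogues of the three closure forms (an analogy, not pynini calls):
#   pynini.closure(x)        ~ x*   zero or more
#   pynini.closure(x, 1)     ~ x+   one or more
#   pynini.closure(x, 0, 1)  ~ x?   zero or one
ZERO_OR_MORE = re.compile(f"(?:{NOT_SPACE})*")
ONE_OR_MORE = re.compile(f"(?:{NOT_SPACE})+")
ZERO_OR_ONE = re.compile(f"(?:{NOT_SPACE})?")


def accepts(pattern: re.Pattern, text: str) -> bool:
    """Return True if the whole string is accepted by the pattern."""
    return pattern.fullmatch(text) is not None


if __name__ == "__main__":
    for text in ["", "a", "abc", "a b"]:
        print(
            repr(text),
            "x*:", accepts(ZERO_OR_MORE, text),
            "x+:", accepts(ONE_OR_MORE, text),
            "x?:", accepts(ZERO_OR_ONE, text),
        )
```

Under this analogy, the one-or-more form rejects the empty string and anything containing a space, which is the behavior the `word` grammar in the diff relies on: a word is a non-empty run of non-space characters.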
Signed-off-by: Simon Zuberek <[email protected]>
tbartley94 left a comment
LGTM
* Fix space issue with ZH ITN
  Signed-off-by: Anand Joseph <[email protected]>
* Update Jenkinsfile
  Update FST paths
  Signed-off-by: anand-nv <[email protected]>
---------
Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: anand-nv <[email protected]>
Signed-off-by: Simon Zuberek <[email protected]>
Co-authored-by: Anand Joseph <[email protected]>
Co-authored-by: anand-nv <[email protected]>
Signed-off-by: Namrata Gachchi <[email protected]>

* Fix space issue with ZH ITN
  Signed-off-by: Anand Joseph <[email protected]>
* Update Jenkinsfile
  Update FST paths
  Signed-off-by: anand-nv <[email protected]>
---------
Signed-off-by: Anand Joseph <[email protected]>
Signed-off-by: anand-nv <[email protected]>
Signed-off-by: Simon Zuberek <[email protected]>
Co-authored-by: Anand Joseph <[email protected]>
Co-authored-by: anand-nv <[email protected]>
What does this PR do?
The fix addresses an incorrectly applied closure over a single-character token in the `WORD` grammar, which resulted in all strings being classified as `WORDS` and subsequently skipped for denormalization.
Before your PR is "Ready for review"
Pre checks:
- `git commit -s` to sign.
- `pytest` or (if your machine does not have GPU) `pytest --cpu` from the root folder (given you marked your test cases accordingly `@pytest.mark.run_only_on('CPU')`).
- `bash tools/text_processing_deployment/export_grammars.sh --MODE=test ...`
- `pytest` and Sparrowhawk here.
- `__init__.py` for every folder and subfolder, including `data` folder which has .TSV files?
- `Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.` to all newly added Python files?
- `Copyright 2015 and onwards Google, Inc.`. See an example here.
- (`try import: ... except: ...`) if not already done.

PR Type:
If you haven't finished some of the above items, you can still open a "Draft" PR.