Extract only strings and comments from codeblock #109
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request.
Hi @dalance,
Wow, thank you so much! This looks really good from a quick look! Could you test how https://google.github.io/comprehensive-rust/zh-CN/hello-world/small-example.html looks after this? In particular, does
@dyoo has written a lot of this code and I hope he can review this.
Also, I think this will collide with #107 and the design he started there to support things like translation comments. So we should make sure we can get everything working nicely together here 🙂 I think we would want an option in
d78316f
to
2401cec
Compare
Hi @mgeisler, with this change, small-example.md looks like the following.
The point of conflict with #107 seems to be the modification to
This PR changes messages.pot significantly.
2401cec
to
420f768
Compare
I changed some of the logic to handle consecutive line comments.
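To illustrate what handling consecutive line comments could mean, here is a minimal sketch. It assumes that comments on directly adjacent lines are merged into one message that keeps the line number of the first comment. The function name `group_consecutive_comments` and the merge-by-newline behavior are hypothetical and are not taken from this PR's code.

```rust
/// Hypothetical helper (not the PR's actual code): merges line comments that
/// sit on directly consecutive lines into a single message. Each input item
/// is `(line_number, comment_text)`, sorted by line number.
pub fn group_consecutive_comments(comments: &[(usize, String)]) -> Vec<(usize, String)> {
    // Each group is (first_line, last_line, joined_text).
    let mut groups: Vec<(usize, usize, String)> = Vec::new();

    for (line, text) in comments {
        if let Some((_, last, joined)) = groups.last_mut() {
            // Extend the current group when this comment is on the next line.
            if *last + 1 == *line {
                joined.push('\n');
                joined.push_str(text);
                *last = *line;
                continue;
            }
        }
        // Otherwise start a new group.
        groups.push((*line, *line, text.clone()));
    }

    groups
        .into_iter()
        .map(|(first, _, text)| (first, text))
        .collect()
}
```

With this sketch, comments on lines 1 and 2 become one message starting at line 1, while a comment on line 4 stays separate.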
420f768
to
e1de210
Compare
Thanks for all the updates!
Yeah, I can imagine it will change the files a lot — but for the better! Please consider adding a line about this new behavior to the documentation. I think it could just be a little paragraph in Marking Sections to be Skipped for Translation which says something like
Something like that, feel free to rephrase it how you like.
Yes, this looks great! Please also try running
and verify that the existing translations are correctly paired up with the new messages. The normalization tool is what is supposed to let us change how we extract messages while also keeping existing translations intact. If I did things correctly, it should "just work", but it'll be nice to have confirmation from you 🙂
The result of mdbook-i18n-normalize is below.
c7b861f
to
e19b096
Compare
e19b096
to
70ad914
Compare
@henrif75, when this is merged, we will do a 0.3 release. This is a breaking change: we need to run
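I found a normalization issue: the trailing line break is missing. I tried the following workaround:

    pub fn extract_messages(document: &str) -> Vec<(usize, String)> {
        // Workaround: return a document that starts with `//` and ends with a
        // line break as a single message, so the trailing line break is kept.
        if document.starts_with("//") && document.ends_with("\n") {
            return vec![(1, document.to_string())];
        }
        // ... (rest of the function not shown)
    }

What do you think of this workaround?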
70ad914
to
d7fda79
Compare
I fixed the reviewed issues.
Sounds good. I assume it's going to break ongoing PRs?
I am so sorry for late review comments; I thought I pressed Submit, and it turns out that I did not. :(
fb769d7
to
cd1faa1
Compare
I reflected the review comments and introduced
My solution seems to be broken...
Built-in flags seem to be passed through
If this is only for normalization, then please don't add extra code to support it: normalization is something translators (or @henrif75) will run only once after a new breaking release. I don't think the main code should be expanded to support it. In this case, I believe the problem is that some newlines are incorrect after running normalize? As long as the normal flow works, then these messages will be corrected by
@dyoo, are you happy with the logic here? Please finalize the review so we can merge it.
If there is no need to resolve the normalization issue, I'll remove the logic.
c821d01
to
a48720a
Compare
I removed the logic related to
Thanks @dalance!
Hey @henrif75, yes, it will affect PRs that touch code blocks. Most PRs don't, I believe, so the impact should be less than when the tooling learnt to ignore most Markdown formatting in version 0.2. Note that there will only be conflicts if we run the normalization tool on a translation; until we do, the translation will work just fine in all places except for the code blocks. So I suggest that we land the current in-flight PRs, run the normalization tool, and then people can continue translating like normal (but with much smaller PO files: a quick test shows that we go from 19k to 16k lines!).
a48720a
to
2cf67ed
Compare
I restored the heuristic logic for code blocks that have no language specifier, and refactored the stack operation logic.
Thanks for the changes; looks good to me.
Sorry for the tiny merge conflict in
2cf67ed
to
eb5cde7
Compare
I resolved the conflict.
Fixes #95
This PR adds codeblock parsing support to extract only strings and comments.
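As a rough illustration of the idea (not the implementation in this PR), the sketch below scans the text of a code block and collects only `//` line comments and double-quoted string literals, each with the line number where it starts. The function name `extract_strings_and_comments` is hypothetical. The scanner is deliberately naive: it assumes C-like syntax and ignores block comments, raw strings, and character literals such as `'"'`.

```rust
/// Hypothetical helper (a sketch, not the PR's code): returns the `//` line
/// comments and double-quoted string literals found in `code`, paired with
/// the 1-based line number on which each one starts.
pub fn extract_strings_and_comments(code: &str) -> Vec<(usize, String)> {
    let mut found = Vec::new();
    let mut chars = code.chars().peekable();
    let mut line = 1;

    while let Some(c) = chars.next() {
        match c {
            '\n' => line += 1,
            // Line comment: collect everything up to, but not including,
            // the end of the line.
            '/' if chars.peek() == Some(&'/') => {
                let mut comment = String::from("/");
                while let Some(&next) = chars.peek() {
                    if next == '\n' {
                        break;
                    }
                    comment.push(next);
                    chars.next();
                }
                found.push((line, comment));
            }
            // String literal: collect up to the closing quote, keeping
            // backslash escapes such as `\"` verbatim.
            '"' => {
                let start = line;
                let mut literal = String::from("\"");
                while let Some(next) = chars.next() {
                    literal.push(next);
                    match next {
                        '\\' => {
                            if let Some(escaped) = chars.next() {
                                if escaped == '\n' {
                                    line += 1;
                                }
                                literal.push(escaped);
                            }
                        }
                        '\n' => line += 1,
                        '"' => break,
                        _ => {}
                    }
                }
                found.push((start, literal));
            }
            _ => {}
        }
    }
    found
}
```

Everything else in the code block (keywords, identifiers, punctuation) is skipped, which is what would keep the extracted messages small in a scheme like this.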