-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix utf8 handling in C code #3
Conversation
@@ -22,6 +22,7 @@ Gem::Specification.new do |s| | |||
s.require_paths = ["lib"] | |||
|
|||
s.add_development_dependency "nokogiri", "~> 1.6.0" | |||
s.add_development_dependency "rake", "~> 10.0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NOTE: I can't build it with the latest rake, so fixed the version to v10.
CI fails because of activesupport dependency. How can I fix it? |
@yujinakayama ping? |
Sorry for the late response. Right now I'm busy but will review in this week 🙏 |
The CI issues are fixed in the current master. Could you please rebase against the master? 🙏 |
@@ -223,7 +223,7 @@ sd_autolink__email( | |||
return 0; | |||
|
|||
for (link_end = 0; link_end < size; ++link_end) { | |||
uint8_t c = data[link_end]; | |||
char c = data[link_end]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand this change forces the c
to be virtually 7-bit ASCII but it's a bit obscure. Maybe the way in vmg#463 and defining a macro something like is_ascii()
(c < 0x7f
) would be more explicit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand this change forces the c to be virtually 7-bit ASCII
No, it might be a part of UTF-8's middle bytes, and the change of the type doesn't change the bit pattern of c
.
Because c
is dealt as char
, it should be char
, not uint8_t
(i.e. unsigned char
). I believe this code is the best.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yujinakayama ping
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All right 👍
2718458
to
a316a54
Compare
Rebased. |
Thanks! |
Released as v3.2.2.2. |
Thanks! |
Fix #2, which comes in redcarpet.
Some tests are borrowed from vmg#464