+
Skip to content

FvwmForm: process UTF-8 input and paste request in input fields #1212

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

ONykyf
Copy link
Contributor

@ONykyf ONykyf commented Jun 13, 2025

To make FvwmForm input fields UTF-8 aware, input.size field is duplicated as input.width, the first one containing size in bytes and the latter width in multibyte chars. To allow walking along a string with chars of different lengths (1 to 4 bytes), e.g., for pasting or moving a cursor by arrow keys or by a mouse click, a helper function find_nth_UTF8_char() is introduced.

Fixes #1211

ONykyf added 2 commits June 13, 2025 09:38
To make FvwmForm input fields UTF-8 aware, input.size field is
duplicated as input.width, the first one containing size in bytes
and the latter width in multibyte chars. To allow walking along
a string with chars of different lengths (1 to 4 bytes), e.g.,
for pasting or moving a cursor by arrow keys or by a mouse click,
a helper function find_nth_UTF8_char() is introduced.
@ONykyf ONykyf force-pushed the on/fvwmform-utf8 branch from 1ec6cde to ea426a0 Compare June 13, 2025 09:33
@ONykyf
Copy link
Contributor Author

ONykyf commented Jun 15, 2025

@ThomasAdam @somiaj No more places affected by "multibyteness" found. Testing shows no problems. Please review.

@@ -2135,10 +2143,17 @@ char* find_nth_UTF8_char(char *str, char *before,
}

while (1) {
if (*str == '\0' || (before && str >= before)
if (*str == '\0' || l == 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 == 0? We can do better than this, @ONykyf

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A simple way to signal that no valid UTF-8 char can be found at the requested position, so no need for additional Boolean flags, and convenient to use in conditions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A simple way to signal that no valid UTF-8 char can be found at the requested position, so no need for additional Boolean flags, and convenient to use in conditions.

No... it's just unnecessary.

Please remove this instance, and subsequent ones.

pstr = str;
l = 0;
}
else if (l == 0 || i == 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ibid.

@@ -947,10 +947,11 @@ static void ct_Input(char *cp)
item->input.blanks = fxmalloc(item->input.width);
for (j = 0; j < item->input.width; j++)
item->input.blanks[j] = ' ';
item->input.buf = strlen(item->input.init_value) + 1;
item->input.buf = strlen(item->input.init_value) + 1; /* room for init value */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for this comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.

Comment on lines 1048 to 1053
strcpy(item->input.init_value,cp); /* new initial value in field */
strncpy(item->input.init_value, cp, var_len); /* new initial value */
item->input.init_value[var_len] = '\0';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather see xasprintf or similar used here, rather than a change to strncpy.

Copy link
Contributor Author

@ONykyf ONykyf Jun 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to copy a part of a string up to a certain position (not necessarily to '\0'), so strndup() seems to be the best candidate. In case strndup() fails, the subsequent fxstrdup() will catch this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh.

@ThomasAdam
Copy link
Member

@ONykyf -- Thanks for this. It looks OK, although I'm not going to get time to properly look at it until later on the week.

Other than my very small nits I've identified, I think we can probably improve on the logic in a few areas, but I'll wait until I've had change to go over those properly.

…ocate mouse click faster, and translate labels with Gettext
@ThomasAdam
Copy link
Member

@ONykyf

This PR is getting there, still with comments from me which need addressing.

I'm still not completely happy with the state of things overall though, but it's a convoluted mess and this is probably going to have to be good enough, but I suspect I'll merge this once you've finished addressing things, and then tidy it up afterward.

@ONykyf
Copy link
Contributor Author

ONykyf commented Jun 21, 2025

Hope now the logic of the helper function is more straightforward and manageable enough by whoever will do this in the future.

@ONykyf ONykyf requested a review from ThomasAdam June 21, 2025 17:01
Comment on lines 2138 to 2142
#if 0 /* no UTF-8, single-byte locale */
pstr = str;
str++;
i++;
#else /* parse UTF-8 string */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May as well just remove the #if 0/#else stuff...

@@ -2135,10 +2143,17 @@ char* find_nth_UTF8_char(char *str, char *before,
}

while (1) {
if (*str == '\0' || (before && str >= before)
if (*str == '\0' || l == 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A simple way to signal that no valid UTF-8 char can be found at the requested position, so no need for additional Boolean flags, and convenient to use in conditions.

No... it's just unnecessary.

Please remove this instance, and subsequent ones.

Comment on lines 1048 to 1053
strcpy(item->input.init_value,cp); /* new initial value in field */
strncpy(item->input.init_value, cp, var_len); /* new initial value */
item->input.init_value[var_len] = '\0';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh.

Comment on lines 2122 to 2123
if (num) *num = -1;
if (len) *len = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style: assignment after if() should be on its own line.

Comment on lines 2133 to 2134
if (num) *num = i - 1;
if (len) *len = l;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ibid.

pstr = str;
l = 0;
}
else if ((l == 0) || (*str == '\0')) { /* invalid UTF-8 char or '\0' */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We said we weren't going to do this 1 == 0 business....

Comment on lines 2236 to 2237
if (num) *num = i - 1;
if (len) *len = l;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style, put the conditionals on their own line, after if()...

@ONykyf ONykyf requested a review from ThomasAdam June 21, 2025 18:04
@ONykyf
Copy link
Contributor Author

ONykyf commented Jun 21, 2025

Removed.

Copy link
Member

@ThomasAdam ThomasAdam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be squashed into a previous commit as it's not introducing anything new.

@ONykyf ONykyf force-pushed the on/fvwmform-utf8 branch from 3dd4dca to c18fd01 Compare June 21, 2025 18:42
@ONykyf ONykyf requested a review from ThomasAdam June 21, 2025 18:44
@ONykyf
Copy link
Contributor Author

ONykyf commented Jun 28, 2025

@ThomasAdam Unsafe strndup() replaced with fxstrndup() analogous to fxstrdup() but truncated if necessary.

In fact getpwuid.c also contains strndup() alongside fxstrdup(). Maybe replace it there as well?

@ThomasAdam
Copy link
Member

@ThomasAdam Unsafe strndup() replaced with fxstrndup() analogous to fxstrdup() but truncated if necessary.

No thanks. I don't think we need this.

In fact getpwuid.c also contains strndup() alongside fxstrdup(). Maybe replace it there as well?

getpwuid.c is a compat file found elsewhere, it's not changing to fit in with any wrapper functions.

The point of the fx*() wrappers is mostly for portability; fxmalloc() is perhaps the only exception to this.

You may as well just remove this latest commit, and make the following adjustment:

item->input.init_value = fxmalloc(var_l + 1);
strlcpy(...);

I also think we need to look at item->input_init_value vs item->input.value.

@ONykyf ONykyf force-pushed the on/fvwmform-utf8 branch from 7611217 to 53d8c40 Compare June 28, 2025 11:22
@ONykyf
Copy link
Contributor Author

ONykyf commented Jun 28, 2025

@ThomasAdam Unsafe strndup() replaced with fxstrndup() analogous to fxstrdup() but truncated if necessary.
You may as well just remove this latest commit, and make the following adjustment:

item->input.init_value = fxmalloc(var_l + 1);
strlcpy(...);

Done

I also think we need to look at item->input_init_value vs item->input.value.

item->input_init_value is used to restore item->input.value on form restarts.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

FvwmForm input field fails to process multibyte chars
2 participants
点击 这是indexloc提供的php浏览器服务,不要输入任何密码和下载