Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use the inherit-input-method argument of read-char #858

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

willghatch
Copy link
Contributor

Supercedes #857.

@wasamasa
Copy link
Member

wasamasa commented Jun 22, 2017

I don't quite understand what this fixes and get no different behavior from testing with the "c" interactive code changed. Can you provide a complete repro of an usecase, show how it goes wrong and how it should be? Also, provide details about your setup including the terminal emulator, just as with the bug report template.

edit: From a cursory search, read-char is used for reading markers (which should be alphanumerical), registers (a bit more than alphanumerical) and an unused interactive code. Given that Evil tries emulating Vim as faithfully as possible, I'm inclined to close this in favor of restricting what's read here.

@willghatch
Copy link
Contributor Author

Issue type

  • Bug report

Environment

Emacs version: 26.0.50
Operating System: Arch Linux
Evil version: current master (f371f2c), but also the latest melpa version, and earlier versions
Evil installation type: manual, but also melpa
Graphical/Terminal: Terminal emulator in X11 -- reproducible with xterm, konsole, gnome-terminal, etc
Terminal multiplexer: It happens in tmux as well as in the terminal emulators mentioned above.

Reproduction steps

  • Start Emacs (in terminal mode)
  • Set your keyboard to "eu" with setxkbmap eu, or use any other keyboard layout with non-ascii unicode characters.
  • In insert mode type » (» is typed on the eu layout by pressing right-alt+]). Note that the key is now in the buffer (no mojibake).
  • Go to normal state.
  • Type .
  • Type j or k to move a line. Note that you move, but \273 is put in the buffer.

To more clearly see what is happening, paste in this code:

(defun read-char-test (char)
  (interactive "c")
  (if (equal char (char-after))
      (message "char given (%c) and char-after (%c) are equal" char (char-after))
    (message "char given (%c) and char-after (%c) differ" char (char-after))))
  • Eval that defun.
  • Move your cursor on to the » character.
  • M-x read-char-test, then type »
  • You get the message "char given (Â) and char-after (») differ".
  • Type k to move up a line.
  • Note that \273 is put in the buffer and you move up a line.

Expected behavior

read-char should read the whole » character and not create mojibake.

Actual behavior

read-char is only getting part of the character you type, and then dumping the rest into the buffer as mojibake.

Further notes

This is most saliently an issue for evil-surround, which uses (interactive "c") in some places. I try to use it with non-ascii delimiters like «» and it fails. But this is an issue with anything that uses (interactive "c") and with other things that use (read-char) like setting and going to marks (it's not a big deal for me there, but if I were to, say, remap evil commands to work on a cyrillic or greek keyboard layout it would be a serious problem).

@wasamasa
Copy link
Member

Hm, I can mostly reproduce this, except mojibake being inserted into the buffer, it's only echoed here. For the record, if you try doing the same thing with Vim, it silently ignores and gives you an error if you attempt jumping to it with .

I'm not terribly surprised that reading in a non-ascii character is tricky to do right, I've implemented reading in UTF-8 myself for fun. What surprised me though is that the flag for doing this hides behind the inherit-input-method flag. I'm not using any I'm aware of, yet this makes translation of multibyte input into the correct character work. This conflation of multibyte input and input methods implemented in elisp doesn't sit right with me at all.

@willghatch
Copy link
Contributor Author

Yeah, the conflation doesn't make much sense to me either. But regardless whether or not » and other punctuation should be ignored as a mark name, certainly λ or и should work for marks. Any non-latin alphanumeric character will have the same issue. And personally I care a lot more about (interactive "c") than about marks. read-key gets the right character, but it seems to cause issues (specifically evil-surround not being able to use . to repeat). Perhaps read-key is correct but there is some other bug that interacts with it somehow to cause those issues.

@edkolev
Copy link
Member

edkolev commented Jun 26, 2017

@willghatch will this PR fix the following issue: with non English input method (Bulgarian in my case), f f searches for the letter f instead of ф (for Bulgarian input method)?

Edit: as you've noted, this issue seems related #605

@willghatch
Copy link
Contributor Author

@edkolev I'm afraid not. Doing a little bit of testing, if I'm in input-state and run read-char with inherit-input-method as a true value it does read ф, but if I'm in normal-state it still reads f. I don't know much about the guts of evil-mode, so I couldn't say why.

@ninrod
Copy link
Member

ninrod commented Jun 26, 2017

Just as I did in #857, I'd like to warn that this read-key thing was the main culprit of a very painful regression on evil-surround which I had to revert. It broke the . operation on evil-surround, which is the main feature of evil-surround.

let us be very careful about merging this in.

@willghatch
Copy link
Contributor Author

I've dug in a bit more (because this issue has been bugging me), and found the root cause behind . breaking in evil-surround, and it's a bug in evil-repeat.el. Basically, read-key and read-key-sequence both change the result of future calls to this-command-keys. Evil advises read-key-sequence, but not read-key, which causes the use of read-key to break the key recording for evil-repeat. Ultimately I think #857 was really correct, and that the problems with it are caused by this evil-repeat bug. I'm adding a new PR (#865) addressing this.

@willghatch
Copy link
Contributor Author

Why read-key and read-key-sequence affect this-command-keys and not read-char I'm not sure, but read-key is semantically more correct for the uses that these patches have tried to correct.

@@ -244,7 +244,7 @@ the last column is excluded."

(evil-define-interactive-code "c"
"Read character."
(list (read-char)))
(list (read-char nil t)))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this cannot be changed as it emulates the standard interactive code 'c' which does not set the inherit-input-method argument, but otherwise I do not see why not to merge this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants