Non-ascii characters in data.frame do not print properly

  • Question

  • Hi

    I noticed that R does not print non-ASCII characters (e.g. Japanese) correctly in a data.frame on Windows. Character vectors print properly, but the same strings are garbled once they are inside a data.frame. I know that this is a bug in the original R, but I hope Microsoft R will fix this problem.

    > c('あいうえお')
    [1] "あいうえお" # OK
    > data.frame(c('あいうえお'))
      c...U.3042..U.3044..U.3046..U.3048..U.304A...
    1      <U+3042><U+3044><U+3046><U+3048><U+304A> # Wrong
    Because of this problem, I currently encourage my students to use Linux instead of Windows. However, if Microsoft R supports non-ASCII characters better, I will simply tell students who use Windows to install it!





    Monday, September 11, 2017 6:47 AM

All replies

  • Hi there,

    I have the same issue as Koheiw, but with Chinese characters. I would like to install Linux, but I am working on a corporate laptop and have to use Windows.

    I would appreciate it if someone could suggest some alternatives, as I am not sure how to deal with this. My only option at the moment is to run my script and then do a VLOOKUP in Excel afterwards.

    Thanks,

     

    Friday, June 15, 2018 12:32 PM
  • This is a known issue with R on Windows systems. For a complete description of the issue, see the following post, which also contains a workaround:

    people.fas.harvard.edu/~izahn/posts/reading-data-with-non-native-encoding-in-r/

    You can use the R function 'print.listof()' to display the Japanese characters in the dataframe:

    > print.listof(data.frame(y = c('あいうえお')))

    y :
    [1] あいうえお
    Levels: あいうえお
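
    As a rough sketch of the same idea, you could print each column as a plain vector, since vectors print correctly even when the data.frame itself does not. The helper name print_columns is made up for illustration (it is not part of base R), and stringsAsFactors = FALSE is an assumption added here so that the column stays a character vector instead of a factor:

```r
# Hypothetical helper (not part of base R): print every column of a
# data frame as a plain vector, one column at a time.
print_columns <- function(df) {
  for (name in names(df)) {
    cat(name, ":\n")                  # column name as a header line
    print(df[[name]], quote = FALSE)  # the column itself, printed as a vector
    cat("\n")                         # blank line between columns
  }
  invisible(df)                       # return the input without auto-printing it
}

# Assumption: keep strings as character vectors rather than factors.
df <- data.frame(y = c("あいうえお"), stringsAsFactors = FALSE)
print_columns(df)
```

    This only changes how the data is displayed. The strings stored in the data.frame are not modified.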

    Stephen Weller

    Friday, June 15, 2018 4:53 PM