Oxygen Basic

Title: Using OxygenBasic with Unicode
Post by: Arnold on June 22, 2018, 05:12:32 AM
Hi Charles,

Unicode strings seem to be a complicated subject. By now I understand that MS Windows supports UTF-16 LE and that I can display all kinds of characters in text editors such as Notepad or WordPad. My options for keyboard input are limited, though, so I need other tricks to enter such characters.

As Oxygen supports Unicode strings, I would like to investigate this feature a bit more and have started some tests. I adapted examples\WinGui\HelloWin1.o2bas (without minwin) and it works quite nicely:

Code: OxygenBasic
  $ filename "HelloUnicode.exe"
  'uses RTL32
  'uses RTL64

  % CS_VREDRAW      1
  % CS_HREDRAW      2
  % IDI_APPLICATION 32512
  % IDC_ARROW       32512
  % WHITE_BRUSH      0
  % SM_CXSCREEN      0
  % SM_CYSCREEN      1
  % WS_OVERLAPPEDWINDOW 0x00cf0000
  % SW_SHOW             5
  % WM_CLOSE            16
  % WM_KEYDOWN          0x100
  % WM_DESTROY          2
  % SW_NORMAL           1
  % WS_VISIBLE          0x10000000
  % WS_CHILD            0x40000000
  % WS_CHILDWINDOW      WS_CHILD


  type POINT
  long x
  long y
  end type

  '28 bytes
  type MSG
  sys   hwnd
  int   message
  sys   wParam
  sys   lParam
  dword time
  POINT pt
  end type

  '40 bytes
  type WNDCLASS
  int style
  sys lpfnwndproc
  int cbClsextra
  int  cbWndExtra
  sys hInstance
  sys hIcon
  sys hCursor
  sys hbrBackground
  sys lpszMenuName
  sys lpszClassName
  end type

  extern lib "Kernel32.dll"
  ! GetCommandLine         "GetCommandLineW"      '0
  ! GetModuleHandle        "GetModuleHandleW"     '1
  end extern

  extern lib "User32.dll"
  ! CreateWindowEx        "CreateWindowExW"       '12
  ! DefWindowProc         "DefWindowProcW"        '4
  ! DispatchMessage       "DispatchMessageW"      '1
  ! GetClientRect                                 '2
  ! GetMessage            "GetMessageW"           '4
  ! GetSystemMetrics                              '1
  ! LoadIcon              "LoadIconW"             '2
  ! LoadCursor            "LoadCursorW"           '2
  ! PostQuitMessage                               '1
  ! RegisterClass         "RegisterClassW"        '1
  ! SendMessage           "SendMessageW"          '4
  ! SetWindowText         "SetWindowTextW"        '2
  ! ShowWindow                                    '2
  ! TranslateMessage                              '1
  ! UpdateWindow                                  '1
  end extern

  extern lib "GDI32.dll"
  ! GetStockObject                                '1
  end extern

  declare function WinMain(sys inst, prevInst, asciiz2*cmdline, sys show) as sys
  declare sub getwstrings()

  '=========
  'MAIN CODE
  '=========

  dim cmdline as asciiz2 ptr, hInstance as sys
  @cmdline=GetCommandLine
  hInstance=GetModuleHandle 0

  sys lbl[5]
  wstring hellos[5]

  '
  'WINDOWS START
  '=============
  '
  WinMain hInstance,0,cmdline,SW_NORMAL
  end


function WinMain(sys inst, prevInst, asciiz2*cmdline, sys show) as sys

   WndClass wc
   MSG      wm

   sys hwnd, Wwd, Wht, Wtx, Wty, Tax
   wstring classname="Demo"

   wc.style = CS_HREDRAW or CS_VREDRAW
   wc.lpfnWndProc = @WndProc
   wc.cbClsExtra =0
   wc.cbWndExtra =0
   wc.hInstance =inst
   wc.hIcon=LoadIcon 0, IDI_APPLICATION
   wc.hCursor=LoadCursor 0,IDC_ARROW
   wc.hbrBackground = GetStockObject WHITE_BRUSH
   wc.lpszMenuName =null
   wc.lpszClassName = strptr classname

   RegisterClass (@wc)

   Wwd = 320 : Wht = 200
   Tax = GetSystemMetrics SM_CXSCREEN
   Wtx = (Tax - Wwd) /2
   Tax = GetSystemMetrics SM_CYSCREEN
   Wty = (Tax - Wht) /2

   hwnd = CreateWindowEx 0,wc.lpszClassName,  wstring("OXYGEN BASIC"),WS_OVERLAPPEDWINDOW,Wtx,Wty,Wwd,Wht,0,0,inst,0
   lbl[1]=CreateWindowEx(0, wstring("static"), wstring("Hello World"),WS_CHILDWINDOW|WS_VISIBLE, 20, 10,200,25, hwnd,0,inst,0)
   lbl[2]=CreateWindowEx(0, wstring("static"), "",WS_CHILDWINDOW|WS_VISIBLE, 20, 35,200,25, hwnd,0,inst,0)
   lbl[3]=CreateWindowEx(0, wstring("static"), "",WS_CHILDWINDOW|WS_VISIBLE, 20, 60,200,25, hwnd,0,inst,0)
   lbl[4]=CreateWindowEx(0, wstring("static"), "",WS_CHILDWINDOW|WS_VISIBLE, 20, 85,200,25, hwnd,0,inst,0)
   lbl[5]=CreateWindowEx(0, wstring("static"), "",WS_CHILDWINDOW|WS_VISIBLE, 20,110,200,25, hwnd,0,inst,0)

   getwstrings()

   SetWindowText(lbl[2], hellos[4])   'Grecian
   SetWindowText(lbl[3], hellos[3])   'Japanese
   SetWindowText(lbl[4], hellos[2])   'Korean
   SetWindowText(lbl[5], hellos[1])   'Armenian

   ShowWindow hwnd,SW_SHOW
   UpdateWindow hwnd

   sys bRet
   do while bRet := GetMessage (@wm, 0, 0, 0)
     if bRet = -1 then
       'show an error message
     else
       TranslateMessage @wm
       DispatchMessage @wm
     end if
   wend

end function


function WndProc ( hWnd, wMsg, wParam, lparam ) as sys callback

   select wMsg

      case WM_DESTROY
         PostQuitMessage 0

      case WM_KEYDOWN
         Select wParam
            Case 27 : SendMessage hwnd, WM_CLOSE, 0, 0      'ESCAPE
         End Select

      case else
         function=DefWindowProc hWnd,wMsg,wParam,lParam

   end select

end function ' WndProc

sub getwstrings()
   'very simple
   int n=0
   int p1=1, p2, le
   wstring wcrlf=wchr(13)+wchr(10)

   wstring txt=(wstring) getfile "multi.txt"
   if len(txt)=0 then mbox "Cannot load multi.txt"

   while 1
     p2=instr(p1,txt, wcrlf)
     if p2=0 then exit while
     le=p2-p1+1
     n+=1
     hellos[n]=mid(txt,p1,le)
     p1=p2+2     'wcrlf'
   wend
end sub


There are two items I am not sure about:

In function WinMain I used: asciiz2* cmdline. Is this logic correct?
I had to use: wstring classname="Demo" in order to succeed with strptr classname. Does Oxygen support something similar to wstrptr "Demo"?

My first impression is that functions like len, instr, and mid work very well with wstrings.

Roland
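
As a language-neutral illustration of what getwstrings() does with the file contents, here is a minimal Python sketch (not OxygenBasic). It assumes the file holds UTF-16 LE text with CRLF line endings and possibly a leading byte order mark; the function name and the sample text are made up for this example.

```python
# Sketch: decode UTF-16 LE bytes and collect the CRLF-terminated lines.
# Assumption: the data is valid UTF-16 LE; a leading BOM is optional.

def split_utf16_lines(raw):
    """Return the CRLF-terminated lines of UTF-16 LE data, without the CRLF."""
    text = raw.decode("utf-16-le")
    if text.startswith("\ufeff"):       # drop a leading byte order mark, if any
        text = text[1:]
    lines = []
    start = 0
    while True:
        end = text.find("\r\n", start)
        if end == -1:                   # no further line terminator found
            break
        lines.append(text[start:end])   # line content only, CRLF excluded
        start = end + 2                 # continue after the CRLF
    return lines

# Hypothetical sample: a BOM followed by two greeting lines.
sample = "\ufeffΓειά σου\r\nこんにちは\r\n".encode("utf-16-le")
print(split_utf16_lines(sample))
```

Like the loop in getwstrings(), the sketch ignores any trailing text that is not terminated by CRLF.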
Title: Re: Using OxygenBasic with Unicode
Post by: Charles Pegge on June 22, 2018, 06:02:02 AM
Thank you, Roland.

I will include this demo in examples\WideChars (formerly WideStrings)

asciiz2, zstring2 and wchar are the same type, and the o2 core string functions are aware of character width.

There are schemes for 32bit characters. It would be quite easy to support them, should they ever come into common use.
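
To make the difference between 16-bit and 32-bit character schemes concrete, here is a small Python sketch (not OxygenBasic). It counts the 16-bit units that UTF-16 needs for a string and compares this with the number of code points; the helper name utf16_units is an assumption made for this example.

```python
# Sketch: UTF-16 stores most characters in one 16-bit unit, but characters
# beyond U+FFFF need two units (a surrogate pair). UTF-32 always uses 4 bytes.

def utf16_units(s):
    """Number of 16-bit code units needed to store s as UTF-16."""
    return len(s.encode("utf-16-le")) // 2

for ch in ["A", "Я", "😀"]:
    print(repr(ch),
          "code points:", len(ch),
          "UTF-16 units:", utf16_units(ch),
          "UTF-32 bytes:", len(ch.encode("utf-32-le")))
```

A string function that counts 16-bit units will therefore report a length of 2 for a single character outside the Basic Multilingual Plane.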
Title: Re: Using OxygenBasic with Unicode
Post by: Arnold on June 23, 2018, 09:06:34 AM
This is another interesting experiment with Unicode. I put together two alphabets which I can type on my keyboard (English, German), and three alphabets which I can only build completely by using keycodes (French, Greek, Russian). I do not know whether the keycodes I used are always correct.

I also tried the WinApi functions CharLower and CharUpper, but these functions do not always seem to work correctly (assuming my keycodes were correct).

If I understand this correctly, it should be possible to fill the wstrings MyAlphabet_U / _L with the characters of the language which is supported by the keyboard. Unfortunately I cannot test these cases.

Code: OxygenBasic
$ filename "Alphabets.exe"
'use rtl32
'use rtl64

extern lib "User32.dll"
! CharUpper "CharUpperW"                        '1
! CharLower "CharLowerW"                        '1
! MessageBox            "MessageBoxW"           '1
end extern

wstring tab=wchr(9)
wstring sep

function setup_alphabet(int length_U, wstring lang_U, lang_L) as wstring
   int n=0, x
   int le=length_U
   wstring lang
   wstring capital=space 1, small=space 1
   wstring capital2=space 1, small2=space 1

   for x=1 to le
      n+=1
      sep=") " : if n<10 then sep=")  "
'      lang &= n & sep & mid(lang_U,n,1) & tab & mid(lang_L,n,1) & tab

      'Testing CharUpper and CharLower, ucase and lcase functions
      capital=mid(lang_L,n,1) : CharUpper(capital)
      small=mid(lang_U,n,1) : CharLower(small)
      capital2=mid(lang_L,n,1) : capital2=ucase(capital2)
      small2=mid(lang_U,n,1) : small2=lcase(small2)
      lang &= n & sep & mid(lang_U,n,1) & capital & capital2 & tab & mid(lang_L,n,1) & small & small2 & tab
   next
   return lang
end function

wstring English_U="ABCDEFGHIJKLMNOPQRSTUVWXYZ"
wstring English_L="abcdefghijklmnopqrstuvwxyz"

wstring English=setup_alphabet(len(English_U), English_U, English_L)
MessageBox 0, English ,wstring("English Alphabet"),0


wstring French_U=
wchr( 65)+wchr(192)+wchr(194)+wchr(198)+wchr( 66)+wchr( 67)+wchr(199)+wchr( 68)+wchr( 69)+wchr(200)+
wchr(201)+wchr(202)+wchr(203)+wchr( 70)+wchr( 71)+wchr( 72)+wchr( 73)+wchr(206)+wchr(207)+wchr( 74)+
wchr( 75)+wchr( 76)+wchr( 77)+wchr( 78)+wchr( 79)+wchr(212)+wchr(338)+wchr( 80)+wchr( 81)+wchr( 82)+
wchr( 83)+wchr( 84)+wchr( 85)+wchr(217)+wchr(219)+wchr(220)+wchr( 86)+wchr( 87)+wchr( 88)+wchr( 89)+
wchr(376)+wchr( 90)
wstring French_L=
wchr( 97)+wchr(224)+wchr(226)+wchr(230)+wchr( 98)+wchr( 99)+wchr(231)+wchr(100)+wchr(101)+wchr(232)+
wchr(233)+wchr(234)+wchr(235)+wchr(102)+wchr(103)+wchr(104)+wchr(105)+wchr(238)+wchr(239)+wchr(106)+
wchr(107)+wchr(108)+wchr(109)+wchr(110)+wchr(111)+wchr(244)+wchr(339)+wchr(112)+wchr(113)+wchr(114)+
wchr(115)+wchr(116)+wchr(117)+wchr(249)+wchr(251)+wchr(252)+wchr(118)+wchr(119)+wchr(120)+wchr(121)+
wchr(255)+wchr(122)

wstring French=setup_alphabet(len(French_U), French_U, French_L)
MessageBox 0, French ,wstring("French Alphabet"),0


wstring Grecian_U=
wchr(913)+wchr(914)+wchr(915)+wchr(916)+wchr(917)+wchr(918)+wchr(919)+wchr(920)+wchr(921)+wchr(922)+
wchr(923)+wchr(924)+wchr(925)+wchr(926)+wchr(927)+wchr(928)+wchr(929)+wchr(931)+wchr(932)+wchr(933)+
wchr(934)+wchr(935)+wchr(936)+wchr(937)+wchr(938)+wchr(939)
wstring Grecian_L=
wchr(945)+wchr(946)+wchr(947)+wchr(948)+wchr(949)+wchr(950)+wchr(951)+wchr(952)+wchr(953)+wchr(954)+
wchr(955)+wchr(956)+wchr(957)+wchr(958)+wchr(959)+wchr(960)+wchr(961)+wchr(963)+wchr(964)+wchr(965)+
wchr(966)+wchr(967)+wchr(968)+wchr(969)+wchr(970)+wchr(971)

wstring Grecian=setup_alphabet(len(Grecian_U), Grecian_U, Grecian_L)
MessageBox 0, Grecian ,wstring("Grecian Alphabet"),0


wstring Russian_U=
wchr(1040)+wchr(1041)+wchr(1042)+wchr(1043)+wchr(1044)+wchr(1045)+wchr(1025)+wchr(1046)+
wchr(1047)+wchr(1048)+wchr(1049)+wchr(1050)+wchr(1051)+wchr(1052)+wchr(1053)+wchr(1054)+
wchr(1055)+wchr(1056)+wchr(1057)+wchr(1058)+wchr(1059)+wchr(1060)+wchr(1061)+wchr(1062)+
wchr(1063)+wchr(1064)+wchr(1065)+wchr(1066)+wchr(1067)+wchr(1068)+wchr(1069)+wchr(1070)+wchr(1071)
wstring Russian_L=
wchr(1072)+wchr(1073)+wchr(1074)+wchr(1075)+wchr(1076)+wchr(1077)+wchr(1105)+wchr(1078)+
wchr(1079)+wchr(1080)+wchr(1081)+wchr(1082)+wchr(1083)+wchr(1084)+wchr(1085)+wchr(1086)+
wchr(1087)+wchr(1088)+wchr(1089)+wchr(1090)+wchr(1091)+wchr(1092)+wchr(1093)+wchr(1094)+
wchr(1095)+wchr(1096)+wchr(1097)+wchr(1098)+wchr(1099)+wchr(1100)+wchr(1101)+wchr(1102)+wchr(1103)

wstring Russian=setup_alphabet(len(Russian_U), Russian_U, Russian_L)
MessageBox 0, Russian ,wstring("Russian Alphabet"),0

==============================================================

'This is for testing the own keyboard
wstring MyAlphabet_U="ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÜß"
wstring MyAlphabet_L="abcdefghijklmnopqrstuvwxyzäöüß"

wstring MyAlphabet=setup_alphabet(len(MyAlphabet_U), MyAlphabet_U, MyAlphabet_L)
MessageBox 0, MyAlphabet ,wstring("My Alphabet - Testing the own keyboard"),0

Title: Re: Using OxygenBasic with Unicode
Post by: Arnold on June 25, 2018, 12:34:36 AM
Hello,

I found some Unicode tables and corrected my keycodes a little bit. CharUpper and CharLower now seem to work correctly for wstrings, but lcase and ucase do not. I replaced the code in my previous post.

Roland
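
A quick way to cross-check upper/lower pairs like the ones in the Alphabets listing is to ask a Unicode-aware runtime. The following Python sketch (not OxygenBasic) tests a few of the code pairs used above; the helper name is_case_pair is hypothetical, and Python's built-in case mapping is used here only as a reference, not as a statement about how CharUpperW or ucase behave.

```python
# Sketch: verify that two code points are an upper/lower case pair according
# to Python's Unicode case mapping.

def is_case_pair(upper_cp, lower_cp):
    """True if chr(upper_cp) and chr(lower_cp) map to each other."""
    up, lo = chr(upper_cp), chr(lower_cp)
    return up.lower() == lo and lo.upper() == up

# (upper, lower) code pairs taken from the French, Greek and Russian strings
pairs = [(376, 255), (338, 339), (913, 945), (938, 970), (1025, 1105)]
for u, l in pairs:
    print(u, l, chr(u), chr(l), is_case_pair(u, l))

# German sharp s is a special case: it has no single-character uppercase
# in this mapping, so the result is two characters long.
print("ß".upper())
```

Characters such as ß are worth testing separately, because a one-character-in, one-character-out conversion cannot represent their uppercase form.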
Title: Re: Using OxygenBasic with Unicode
Post by: Arnold on June 26, 2018, 02:20:52 AM
Hello,

Unicode is really complicated. Displaying the characters of a different language in a console depends on the font and the code page. My PC uses code page 850. To display some of the text of \examples\WideStrings\Multilingual.txt I must change the font of cmd.exe to Lucida Console via Properties/Font. With that font I can see Latin, Greek and Cyrillic in my console when I use the type command. It is possible to add more fonts to cmd.exe by using regedit.

Checking the lines of \examples\WideStrings\Multilingual.txt with Wordpad.exe I found these fonts:

MS PGothic / Japanese.
MS PGothic / Cyrillic
MS PGothic / Grecian
Mangal
Mangal / Western
Sylfaen
Sylfaen / Cyrillic
Sylfaen
Gulim
Gulim / Hangul

So my conclusion is that the basic Latin character set can be displayed with most fonts, but each font supports only certain subsets of Unicode.

I also learned that input methods in the console are even more complicated. It will most probably be easier to use only the language/region-specific characters which are already supported by default.

Roland
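
The code page side of this can be illustrated without a console. The following Python sketch (not OxygenBasic) checks whether a piece of text can be represented in a given 8-bit code page at all; the helper name encodable is an assumption for this example, and the check says nothing about fonts, which are a separate requirement.

```python
# Sketch: an 8-bit code page can only represent 256 characters, so text
# outside that set cannot be encoded, whatever font the console uses.

def encodable(text, codepage):
    """True if every character of text exists in the given code page."""
    try:
        text.encode(codepage)
        return True
    except UnicodeEncodeError:
        return False

print(encodable("äöüß", "cp850"))      # Western European letters
print(encodable("русский", "cp850"))   # Cyrillic is not part of cp850
print(encodable("αβγ", "cp850"))       # neither is Greek
print(encodable("русский", "cp866"))   # cp866 is a Cyrillic code page
```

This is why the wide-character console functions are needed for text that mixes several scripts.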
Title: Re: Using OxygenBasic with Unicode
Post by: Charles Pegge on June 26, 2018, 02:55:34 AM
Hi Roland,

Thanks for your latest 'Alphabets'. I will include it in the next update (examples\WideChars). I wonder if it is possible to deploy 3D Unicode typefaces in OpenGL?
Title: Re: Using OxygenBasic with Unicode
Post by: Arnold on June 26, 2018, 07:20:45 AM
Hi Charles,

NeHe tutorials 13, 14 and 15 use wglUseFontBitmapsA and wglUseFontOutlinesA for building the fonts. The Unicode versions of these functions are also available:
https://docs.microsoft.com/de-de/windows/desktop/api/wingdi/nf-wingdi-wglusefontbitmapsw

So I assume that with a suitable TrueType font it should be possible to build a font with Unicode characters in OpenGL, create a wstring and display it.

But at the moment I am far away from trying this.
Title: Re: Using OxygenBasic with Unicode
Post by: Charles Pegge on June 27, 2018, 04:36:56 AM
Thanks, Roland.

I'm familiar with the font outlines technology, and it is deployed in OpenglSceneFrame. I see we have easy access to a variety of MS character sets through the LOGFONT structure, but I don't know how these are mapped into Unicode. Anyway, not to worry. It's mostly idle curiosity.
Title: Re: Using OxygenBasic with Unicode
Post by: Arnold on June 28, 2018, 07:52:58 AM
Hi Charles,

this is a small test to input and display Unicode text in a console window. Basically the app works in 32-bit, but there is a problem if I compile to a 64-bit exe: the input function for Unicode accepts only one character. The function is similar to input() of console.inc and examples\System\CustomConsole.o2bas. I am not sure what I can do. I tried some variations but they do not work. Is there something I missed?

Roland
Code: OxygenBasic
$ filename "TestUnicodeConsole.exe"
'uses RTL32
'uses RTL64

uses console

% LF_FACESIZE=32
% FF_DONTCARE=0
% FW_NORMAL=400
% TMPF_TRUETYPE=4

type CONSOLE_FONT_INFOEX
  ulong cbSize
  dword nFont
  COORD dwFontSize
  uint  FontFamily
  uint  FontWeight
  wchar FaceName[LF_FACESIZE]
end type

extern lib "kernel32.dll"
! SetConsoleTitleW             '1
! ReadConsoleW                 '5
! WriteConsoleW                '5
! SetConsoleCP                 '1 -Codepage for Input
! GetConsoleCP                 '0
! SetConsoleOutputCP           '1
! GetConsoleOutputCP           '0
! SetCurrentConsoleFontEx      '3 -Vista and later
! GetCurrentConsoleFontEx      '3 -Vista and later
end extern

  Function woutput(wstring bufout)
  ==============================
  static long buflen,bufrit
  buflen=len(bufout)
  WriteConsoleW ConsOut,strptr bufout,buflen,bufrit,null
  End Function

  function winput() As wstring
  ==========================
  static int blen
  static zstring2 bufin[0x1000]
  ReadConsoleW ConsIn,strptr bufin,0xfff,@blen,null
  return left bufin,blen
  end function

  function wwait() as int
  ===========================
  word bufin[4]
  dword blen, mode
  GetConsoleMode ConsIn,mode
  SetConsoleMode ConsIn,0
  ReadConsoleW ConsIn,@bufin,1, @blen,0
  SetConsoleMode ConsIn,mode
  function=bufin
  end function

  wstring wcrlf=wchr(13)+wchr(10)
  wstring wtab=wchr(9)

  'wprint override
  def wprint   woutput

=========================================================


'russian
wstring title="OxygenBasic Unicode Test: "+
      wchr(1088)+wchr(1091)+wchr(1089)+wchr(1089)+wchr(1082)+wchr(1080)+wchr(1081)

'Ansi
SetConsoleTitle title

printl "Ansi:"
int cp=GetConsoleOutputCP
printl "Codepage = " cp
printl "Title: " title
printl : printl
print "What is your name? "
wstring name = input()
print  "Ansi:    " name
wprint "Unicode: " name

printl
printl "Enter ..." : wwait


'Unicode
SetConsoleTitleW title

wstring fontname
fontname="Lucida Console"
CONSOLE_FONT_INFOEX cfi
cfi.cbsize=sizeof(CONSOLE_FONT_INFOEX)
cfi.FontFamily=TMPF_TRUETYPE
cfi.FaceName=fontname

SetCurrentConsoleFontEx ConsOut,true , cfi
GetCurrentConsoleFontEx ConsOut, true, cfi

printl
printl "Changed Font to " cfi.FaceName

printl
printl  "Ansi:    " title
printl
wprint  "Unicode: " title
printl

printl "your name? "
name = winput()

printl "Your name (Ansi)    = " name
wprint "Your name (Unicode) = " name
printl

printl "Russian Alphabet"
wstring codes
int x
for x=1040 to 1103
  codes &= wchr(x) & ","
next
wprint codes

printl
printl "Enter ..." : wwait()

cls
fontname="Terminal"
CONSOLE_FONT_INFOEX cfi2
cfi2.cbsize=sizeof(CONSOLE_FONT_INFOEX)
cfi2.FaceName= fontname

SetCurrentConsoleFontEx ConsOut, true, cfi2
GetCurrentConsoleFontEx ConsOut, true, cfi2

printl "Back to " cfi2.FaceName
printl
printl "Russian alphabet (Ansi)"
printl codes
printl "Russian alphabet (Unicode)"
printl
wprint codes

printl
printl "Enter ..." : wwait

Title: Re: Using OxygenBasic with Unicode
Post by: Charles Pegge on June 28, 2018, 08:27:05 PM
Hi Roland,

The problem occurs in GetKey which disrupts further wInput calls....

This works:

Code: OxygenBasic
  function wwait() as int
  ===========================
  word bufin[4]
  dword blen, mode
  GetConsoleMode ConsIn,mode
  SetConsoleMode ConsIn,0
  ReadConsoleW ConsIn,@bufin,1, @blen,0
  SetConsoleMode ConsIn,mode
  function=bufin
  end function
Title: Re: Using OxygenBasic with Unicode
Post by: Arnold on June 29, 2018, 03:12:30 AM
Hi Charles,

thank you for the code. I updated the code in my previous post by adding the wwait() function and removing the commented-out winput functions.
Unfortunately the app still only works correctly in 32-bit mode. The winput() function is derived from input() of console.inc; the difference is that it uses ReadConsole instead of ReadFile.

But from searching the Internet it seems that others also have problems with ReadConsole in Win64. Some more investigation is necessary.

Roland
Title: Re: Using OxygenBasic with Unicode
Post by: Arnold on June 29, 2018, 04:20:00 AM
Oh dear! Sometimes I am a bit awkward. I must replace all waitkey calls with wwait! Then everything works fine in 32-bit as well as in 64-bit.
Title: Re: Using OxygenBasic with Unicode
Post by: Arnold on June 29, 2018, 05:58:12 AM
Hello,

this might be interesting:

I opened examples\WideStrings\Multilingual.txt with notepad.exe, started \examples\WinDynDialogs\RichEdit2.o2bas and copied the contents of Multilingual.txt into the RichEdit control. To my surprise this worked very well.

Entering text in ANSI is no problem either.

Entering Unicode characters with the Alt+keycode combination works too.

So it will be interesting to see how the WinApi Unicode functions work with RichEdit and what must be done to stream in Unicode text.

Roland
Title: Re: Using OxygenBasic with Unicode
Post by: Charles Pegge on June 29, 2018, 06:49:23 AM
I'll alter getkey in console.inc. By removing the default param, it will be possible to override getkey in your Unicode console extension. Then wait, waitkey and pause will be useable.

in console.inc
Code: OxygenBasic
  function GetKey() as int
  ========================
  static long blen,mode
  static byte z[4]
  GetConsoleMode ConsIn,mode
  SetConsoleMode ConsIn,0
  ReadConsole ConsIn,@z,1,blen,null
  SetConsoleMode ConsIn,mode
  return z
  end function

unicode override (wwait)
Code: OxygenBasic
  'UNICODE OVERRIDE
  function GetKey() as int
  ========================
  word bufin[4]
  dword blen, mode
  GetConsoleMode ConsIn,mode
  SetConsoleMode ConsIn,0
  ReadConsoleW ConsIn,@bufin,1, @blen,0
  SetConsoleMode ConsIn,mode
  function=bufin
  end function
Title: Re: Using OxygenBasic with Unicode
Post by: Arnold on July 01, 2018, 12:12:24 AM
Hi Charles,

would it be possible to use getfile with filenames of type wstring? It seems that there is also a difference between FAT12, FAT16, and FAT32 file systems.

If nothing else is possible then I will use getfile / putfile with filename as string. But maybe there is also an approach for unicode?

Roland

Code: OxygenBasic
'https://docs.microsoft.com/de-de/windows/desktop/Intl/character-sets-used-in-file-names

string fname="multilingual.txt"
wstring s= (wstring) getfile(fname)
print len s
print s

wstring fn="multilingual.txt"
wstring txt= (wstring) getfile(fn)
print len txt
print txt
Title: Re: Using OxygenBasic with Unicode
Post by: Charles Pegge on July 01, 2018, 07:26:45 AM
Hi Roland,

Using _wfopen from msvcrt.inc.

Code: OxygenBasic
uses corewin

function wGetFile(wstring name) as wstring
==========================================
sys f
int e
wstring m="rb"
bstring2 s
f=_wfopen name,m      'open for reading binary
fseek f,0,2           'end of file
e=ftell f             'get position
fseek f,0,0           'beginning of file
strptr s=getmemory e  'create buffer to fit
fread s,1,e,f         'load buffer
fclose f              'close file
return s
end function


function wPutFile(wstring name, wstring s) as int
=================================================
sys f
wstring m="wb"
int e=len s
f=_wfopen name,m      'open for writing binary
e=fwrite s,1,e*2,f    'save data
fclose f              'close file
return e
end function
'
'TESTS
======
'
wstring s=wgetfile "Multilingual.txt"
print s
wputfile "t.txt",s
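
The load/save logic of wGetFile and wPutFile can be mirrored in a few lines of Python (not OxygenBasic) for comparison. This is only a sketch under stated assumptions: the file is taken to be valid UTF-16 LE, the function names are hypothetical, and returning an empty string for a file that cannot be opened is one possible policy, not necessarily what the OxygenBasic versions should do.

```python
import os
import tempfile

def read_utf16_file(path):
    """Read a whole file in binary mode and decode it as UTF-16 LE.
    Returns "" if the file cannot be opened (assumed policy)."""
    try:
        with open(path, "rb") as f:
            raw = f.read()
    except OSError:
        return ""
    return raw.decode("utf-16-le")

def write_utf16_file(path, text):
    """Write text as UTF-16 LE and return the number of bytes written."""
    data = text.encode("utf-16-le")   # two bytes per 16-bit unit
    with open(path, "wb") as f:
        return f.write(data)

# Round trip in a temporary directory, plus a lookup of a missing file.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "t.txt")
    written = write_utf16_file(path, "Привет")
    roundtrip = read_utf16_file(path)
    missing = read_utf16_file(os.path.join(folder, "no_such_file.txt"))

print(written, roundtrip, repr(missing))
```

The byte count returned by the writer is twice the character count for this sample, which matches the e*2 passed to fwrite in wPutFile.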
Title: Re: Using OxygenBasic with Unicode
Post by: Arnold on July 01, 2018, 11:17:32 AM
Hi Charles,

thank you for the help. I think the functions are useful. While applying them I noticed that wGetFile crashes if I use an invalid filename. I inserted this line:
if f=0 then s="" : return s

but I do not know if this is sufficient; I do not know the effect on bstring2 s.

I also found that I need to load / save ordinary strings as well, so I added overloaded functions. They can probably be combined with a macro, but I am not very good at this. So this would be my temporary solution:

Code: OxygenBasic
$ filename "UnicodeFilename.exe"
'uses rtl32
'uses rtl64

uses corewin

function wGetFile(wstring name) as wstring
==========================================
sys f
int e
wstring m="rb"
bstring2 s
f=_wfopen name,m      'open for reading binary
if f=0 then s="" : return s
fseek f,0,2           'end of file
e=ftell f             'get position
fseek f,0,0           'beginning of file
strptr s=getmemory e  'create buffer to fit
fread s,1,e,f         'load buffer
fclose f              'close file
return s
end function

function wGetFile(wstring name) as string
==========================================
sys f
int e
wstring m="rb"
bstring s
f=_wfopen name,m      'open for reading binary
if f=0 then s="" : return s
fseek f,0,2           'end of file
e=ftell f             'get position
fseek f,0,0           'beginning of file
strptr s=getmemory e  'create buffer to fit
fread s,1,e,f         'load buffer
fclose f              'close file
return s
end function


function wPutFile(wstring name, wstring s) as int
=================================================
sys f
wstring m="wb"
int e=len s
f=_wfopen name,m      'open for writing binary
e=fwrite s,1,e*2,f    'save data
fclose f              'close file
return e
end function

function wPutFile(wstring name, string s) as int
=================================================
sys f
wstring m="wb"
int e=len s
f=_wfopen name,m      'open for writing binary
e=fwrite s,1,e,f      'save data
fclose f              'close file
return e
end function

'
'TESTS
======

string fname="multilingual.txt"
wstring s= (wstring) wgetfile(fname)
print len s
print s
wputfile "t1.txt",s

wstring fn="multilingual.o2bas"
string txt= wgetfile(fn)
print len txt
print txt

wputfile "t2.txt",txt
Title: Re: Using OxygenBasic with Unicode
Post by: Charles Pegge on July 02, 2018, 01:38:31 PM
Hi Roland,

This has macros to select Unicode file i/o on the basis of the data string type using widthof()

Code: OxygenBasic
uses corewin

function wGetFile(wstring n) as wstring
======================================
sys f
int e
wstring m="rb"
bstring2 s
f=_wfopen n,m         'open for reading binary
if not f then return ""
fseek f,0,2           'end of file
e=ftell f             'get position
fseek f,0,0           'beginning of file
strptr s=getmemory e  'create buffer to fit
fread s,1,e,f         'load buffer
fclose f              'close file
return s
end function


function wGetFile(wstring n,*s)
==============================
s=wGetFile n
end function


function wPutFile(wstring n,s) as int
====================================
sys f
wstring m="wb"
int e=len s
f=_wfopen n,m         'open for writing binary
if not f then return 0
e=fwrite s,1,e*2,f    'save data
fclose f              'close file
return e
end function
'
macro ugetfile (f,s)
===================
#if widthof(s)=2
  wgetfile(f,s)
#else
  getfile(f,s)
#endif
end macro
'
macro uputfile (f,s)
===================
#if widthof(s)=2
  wputfile(f,s)
#else
  putfile(f,s)
#endif
end macro
'
'
'TESTS
======
'
wstring f="t.txt"
wstring s
's=getfile f
ugetfile(f,s)
print s
f="s.txt"
uputfile(f,s)
Title: Re: Using OxygenBasic with Unicode
Post by: Arnold on July 03, 2018, 12:13:51 AM
Hi Charles,

these are my tests:

string fname="multilingual.txt"
wstring s= getfile(fname)
ugetfile(fname,s)
print len s
print s
uputfile ("t1.txt",s)

wstring fn="multilingual.o2bas"
wstring txt
txt=getfile(fn)
ugetfile(fn,txt)
print len txt
'print txt
uputfile ("t2.txt",txt)

print txt displays the string as Unicode chars, but t2.txt is saved correctly. I can certainly use any of the approaches above for my REdit experiments.

At the beginning I mistakenly used the class "Richedit20A", which makes a difference if I use RegisterClassW for a Unicode window. If I load an ANSI text I must translate it to Unicode.
Currently I can open ANSI, UTF-8 and UTF-16 files. Right now I am investigating whether it is possible to display an RTF file too. Otherwise I will omit this option.

Roland

Edit: it seems I can open and display .rtf files in every mode. I am a bit confused.
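
One common way to tell ANSI, UTF-8 and UTF-16 files apart is to look for a byte order mark first and fall back to a guess otherwise. The following Python sketch (not OxygenBasic) shows that idea; the function name decode_text and the choice of cp1252 as the ANSI fallback are assumptions for this example, and files without a BOM can still be misdetected.

```python
import codecs

def decode_text(raw, fallback="cp1252"):
    """Decode file bytes: use a BOM if present, then try UTF-8, then fall
    back to an assumed ANSI code page."""
    if raw.startswith(codecs.BOM_UTF8):
        return raw[len(codecs.BOM_UTF8):].decode("utf-8")
    if raw.startswith(codecs.BOM_UTF16_LE):
        return raw[len(codecs.BOM_UTF16_LE):].decode("utf-16-le")
    if raw.startswith(codecs.BOM_UTF16_BE):
        return raw[len(codecs.BOM_UTF16_BE):].decode("utf-16-be")
    try:
        return raw.decode("utf-8")              # BOM-less UTF-8 or plain ASCII
    except UnicodeDecodeError:
        return raw.decode(fallback, errors="replace")

print(decode_text(codecs.BOM_UTF16_LE + "Ω".encode("utf-16-le")))
print(decode_text(codecs.BOM_UTF8 + b"abc"))
print(decode_text("é".encode("utf-8")))
print(decode_text(b"\xe9"))   # a lone 0xE9 byte is not valid UTF-8
```

RTF files are plain 7-bit text with control words, which may be why they load in every mode.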

Title: Re: Using OxygenBasic with Unicode
Post by: Arnold on July 03, 2018, 08:27:33 AM
Hi Charles,

this is my result for creating a richedit control using the unicode functions. This is not intended to be an editor, only to show the different effects on loading a file as ansi, utf16, utf8, rtf or plain text. I used your initial wGetFile method, I did not test the later version.

Attached is a zip file with the source code and three testfiles in different formats, but the app can be tested with any type of text.

This will end my basic tests with unicode. I assume all the necessary tools are available to start unicode programming seriously?

Roland

Edit: Although I did not add a right-click popup menu, it should be possible to use Ctrl-C / copy selected text, Ctrl-V / paste text into RichEdit, Ctrl-Z / Undo and other short-cuts.

Code: OxygenBasic
'https://docs.microsoft.com/de-de/windows/desktop/Controls/rich-edit-controls

$ filename "RicheditUnicode.exe"
'uses RTL32
'uses RTL64

uses WinData

% COLOR_WINDOW = 5
% ES_SAVESEL=0x8000
% SWP_NOZORDER=4
% SS_LEFT=0
% SS_NOTIFY=0x0100
% SF_TEXT=1
% EM_SETTEXTEX=1121
% ST_UNICODE=8
% CP_ACP=0
% CP_UTF8=65001 'codepage UTF8
% WM_CLEAR=771

type SETTEXTEX
  dword flags
  uint  codepage
end type


extern lib "Kernel32.dll"
! GetCommandLine         "GetCommandLineW"      '0
! GetModuleHandle        "GetModuleHandleW"     '1
end extern

extern lib "User32.dll"
! CreateWindowEx        "CreateWindowExW"       '12
! DefWindowProc         "DefWindowProcW"        '4
! DispatchMessage       "DispatchMessageW"      '1
! GetClientRect                                 '2
! GetMessage            "GetMessageW"           '4
! GetSystemMetrics                              '1
! IsWindowUnicode
! LoadIcon              "LoadIconW"             '2
! LoadCursor            "LoadCursorW"           '2
! MessageBox            "MessageBoxW"           '4
! PostQuitMessage                               '1
! RegisterClass         "RegisterClassW"        '1
! SendMessage           "SendMessageW"          '4
! SetFocus
! SetWindowPos                                  '7
! SetWindowText         "SetWindowTextW"        '2
! ShowWindow                                    '2
! TranslateMessage                              '1
! UpdateWindow                                  '1
end extern

extern lib "Comctl32.dll"
! InitCommonControlsEx                          '1
end extern

#ifndef mode64bit
  extern lib "Msvcrt.dll" cdecl
#else
  extern lib "Msvcrt.dll"
#endif
! _wfopen
! fclose
! fread
! fseek
! ftell
end extern

'load a file into a wide (16-bit) string
function wGetFile(wstring name) as wstring
==========================================
  sys f
  int e
  wstring m="rb"
  bstring2 s
  f=_wfopen name,m      'open for reading binary
  if f=0 then return ""
  fseek f,0,2           'end of file
  e=ftell f             'get position
  fseek f,0,0           'beginning of file
  strptr s=getmemory e  'create buffer to fit
  fread s,1,e,f         'load buffer
  fclose f              'close file
  return s
end function

'load a file into a byte (8-bit) string
function wGetFile(wstring name) as string
==========================================
  sys f
  int e
  wstring m="rb"
  bstring s
  f=_wfopen name,m      'open for reading binary
  if f=0 then return ""
  fseek f,0,2           'end of file
  e=ftell f             'get position
  fseek f,0,0           'beginning of file
  strptr s=getmemory e  'create buffer to fit
  fread s,1,e,f         'load buffer
  fclose f              'close file
  return s
end function

'create a structure of INITCOMMONCONTROLSEX
INITCOMMONCONTROLSEXt iccex

iccex.dwSize=sizeof(iccex)
'Register Common Controls
iccex.dwICC= 0xffff
InitCommonControlsEx(@iccex)

LoadLibrary("RICHED20.DLL")


declare function WinMain(sys inst, prevInst, asciiz2*cmdline, sys show) as sys

'=========
'MAIN CODE
'=========

dim cmdline as wchar ptr, hInstance as sys
@cmdline=GetCommandLine
hInstance=GetModuleHandle 0

sys hButton1, hButton2, hREdit, hEdit, Group
% ID_BUTTON1=1000
% ID_BUTTON2=1001
sys lbl[5]
'five labels, so the array needs five elements
wstring lblTxt[5]={"   Ansi","UTF-16","UTF-8","Text","Open file as: "}
sys radio[4]
sys id_radio[4]={101,102,103,104}


'WINDOWS START
'=============

WinMain hInstance,0,cmdline,SW_NORMAL
end


function WinMain(sys inst, prevInst, asciiz2*cmdline, sys show) as sys

   WndClass wc
   MSG      wm

   sys hwnd, Wwd, Wht, Wtx, Wty, Tax
   wstring classname="REDemo"

   wc.style = CS_HREDRAW or CS_VREDRAW
   wc.lpfnWndProc = @WndProc
   wc.cbClsExtra =0
   wc.cbWndExtra =0
   wc.hInstance =inst
   wc.hIcon=LoadIcon 0, IDI_APPLICATION
   wc.hCursor=LoadCursor 0,IDC_ARROW
   wc.hbrBackground = COLOR_WINDOW
   wc.lpszMenuName =null
   wc.lpszClassName = strptr classname

   RegisterClass (@wc)

   'center the window on the screen
   Wwd = 640 : Wht = 400
   Tax = GetSystemMetrics SM_CXSCREEN
   Wtx = (Tax - Wwd) /2
   Tax = GetSystemMetrics SM_CYSCREEN
   Wty = (Tax - Wht) /2

   'Main Window
   hwnd = CreateWindowEx 0,wc.lpszClassName,  wstring("OXYGEN BASIC"),WS_OVERLAPPEDWINDOW,Wtx,Wty,Wwd,Wht,0,0,inst,0

   'Buttons
   hButton1=CreateWindowEx(0,
      wstring("Button"), wstring("Open File"),
      WS_CHILD | WS_VISIBLE | BS_PUSHBUTTON,
      500, 10, 80 ,25,
      hwnd,ID_BUTTON1,inst, 0)
   hButton2=CreateWindowEx(0,
      wstring("Button"), wstring("Clear REdit"),
      WS_CHILD | WS_VISIBLE | BS_PUSHBUTTON,
      500, 45, 80 ,25,
      hwnd,ID_BUTTON2,inst, 0)

   'Edit
   hEdit=CreateWindowEx(WS_EX_CLIENTEDGE,
      wstring("Edit"), wstring(""),
      WS_CHILD | WS_VISIBLE | ES_LEFT | WS_BORDER | ES_AUTOHSCROLL,
      20, 10, 450 ,25,
      hwnd,0,inst,0)

   'Labels
   int x, lft, wid
   lft=130 : wid=50
   for x=1 to 5
      if x=5 then lft=-340 : wid=90
      lbl[x]=CreateWindowEx(0,
      wstring("Static"), wstring(""),
      WS_CHILD | WS_VISIBLE | SS_LEFT,
      lft+((x-1)*90), 45, wid ,25,
      hwnd,0,inst,0)
      SetWindowText(lbl[x],lblTxt[x])
   next x

   'Radios
   for x=1 to 4
      radio[x]=CreateWindowEx(0,
      wstring("Button"), wstring(""),
      WS_CHILD | WS_VISIBLE | BS_AUTORADIOBUTTON,
      180+((x-1)*90), 45, 15 ,15,
      hwnd,id_radio[x],inst,0)
   next x
   SendMessage(radio[1], BM_SETCHECK, true,0)

   'Richedit
   hREdit=CreateWindowEx(WS_EX_CLIENTEDGE,
      wstring("RichEdit20W"), wstring(""),
      WS_CHILDWINDOW|WS_VISIBLE | WS_VSCROLL | WS_HSCROLL | ES_AUTOVSCROLL | ES_SAVESEL | ES_MULTILINE | WS_BORDER | ES_WANTRETURN,
      20, 60,200,25,
      hwnd,0,inst,0)

   SetFocus(hEdit)
   ShowWindow hwnd,SW_SHOW
   UpdateWindow hwnd

   sys bRet
   do while bRet := GetMessage (@wm, 0, 0, 0)
     if bRet = -1 then
       'show an error message
     else
       TranslateMessage @wm
       DispatchMessage @wm
     end if
   wend

end function


function WndProc (sys hWnd, uint wMsg, sys wParam, lparam ) as sys callback
   SETTEXTEX settext

   select wMsg

      case WM_COMMAND
        select loword(wParam)
           case ID_BUTTON1
              'read the file name from the edit control
              wstring fName = space 256
              SendMessage(hEdit,  WM_GETTEXT, len(fName), fName)
              fname=ltrim(rtrim(fname))
              if len(fname)>0 then
                if SendMessage(Radio[1],BM_GETCHECK,0,0)=BST_CHECKED then 'Ansi
                  string text1
                  text1 = wgetfile (fName)
                  if len(text1)=0 then
                    MessageBox(hWnd, wstring(fName) + " missing or load failure!", wstring("Load File"), MB_OK or MB_ICONASTERISK)
                  else
                    'RegisterClassW creates Unicode window, so translate to Unicode
                    settext.flags=ST_UNICODE
                    settext.codepage=CP_ACP  'System codepage
                    SendMessage(hREdit, EM_SETTEXTEX, &settext, text1)
                  end if
                end if

                if SendMessage(Radio[2],BM_GETCHECK,0,0)=BST_CHECKED then 'Unicode
                  wstring text2
                  text2 = (wstring) wgetfile (fName)
                  if len(text2)=0 then
                    MessageBox(hWnd, wstring(fName) + " load failure!", wstring("Load File"), MB_OK or MB_ICONASTERISK)
                  else
                    SendMessage(hREdit, WM_SETTEXT, 0, text2)
                  end if
                end if

                if SendMessage(Radio[3],BM_GETCHECK,0,0)=BST_CHECKED then 'UTF8
                  'UTF-8 is loaded as raw bytes; the RichEdit control converts it
                  string text3
                  text3 = wgetfile (fName)
                  if len(text3)=0 then
                    MessageBox(hWnd, wstring(fName) + " missing or load failure!", wstring("Load File"), MB_OK or MB_ICONASTERISK)
                  else
                    'RegisterClassW creates Unicode window, so translate to Unicode
                    settext.flags=ST_UNICODE
                    settext.codepage=CP_UTF8  'cp 65001
                    SendMessage(hREdit, EM_SETTEXTEX, &settext, text3)
                  end if
                end if

                if SendMessage(Radio[4],BM_GETCHECK,0,0)=BST_CHECKED then 'Plain Text
                  string text4
                  text4 = wgetfile (fName)
                  if len(text4)=0 then
                    MessageBox(hWnd, wstring(fName) + " missing or load failure!", wstring("Load File"), MB_OK or MB_ICONASTERISK)
                  else
                    'Workaround
                    text4=chr(32) & text4
                    SendMessage(hREdit, WM_SETTEXT, SF_TEXT, text4)
                  end if
                end if

              end if

           case ID_BUTTON2
              'select everything, then delete the selection
              SendMessage(hREdit, EM_SETSEL, 0, -1)
              SendMessage(hREdit, WM_CLEAR, 0, 0)
        end select

      case WM_SIZE
         RECT rcClient

         // Calculate remaining height and size edit
         GetClientRect(hwnd, &rcClient)
         SetWindowPos(hREdit, NULL, 0, rcClient.top+75, rcClient.right, rcClient.bottom-75, SWP_NOZORDER)

      case WM_DESTROY
         PostQuitMessage 0

      case WM_KEYDOWN
         Select wParam
            Case 27 : SendMessage hwnd, WM_CLOSE, 0, 0      'ESCAPE
         End Select

      case else
         function=DefWindowProc hWnd,wMsg,wParam,lParam

   end select

end function ' WndProc

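A side note on the "Open file as" radio buttons: when a file starts with a byte-order mark (BOM), the encoding could in principle be guessed instead of chosen by hand. The following is only a Python sketch of that idea, not part of the OxygenBasic demo; `guess_encoding` is a hypothetical helper, and a file without a BOM still needs the manual choice (UTF-32 is ignored here).

```python
import codecs

def guess_encoding(data: bytes) -> str:
    """Guess a text encoding from a leading byte-order mark.

    Hypothetical helper for illustration. "ansi" is only a fallback
    for "no BOM found", not a real detection.
    """
    if data.startswith(codecs.BOM_UTF8):       # EF BB BF
        return "utf-8"
    if data.startswith(codecs.BOM_UTF16_LE):   # FF FE
        return "utf-16-le"
    if data.startswith(codecs.BOM_UTF16_BE):   # FE FF
        return "utf-16-be"
    return "ansi"

print(guess_encoding(b"\xff\xfeH\x00i\x00"))   # utf-16-le
print(guess_encoding(b"plain text"))           # ansi
```
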
Title: Re: Using OxygenBasic with Unicode
Post by: Arnold on July 04, 2018, 10:55:15 AM
When I started searching the Internet for Unicode examples, I found this thread in José Roca's forum by chance:
http://www.jose.it-berater.org/smfforum/index.php?topic=5253.msg22686#msg22686

I came to the same conclusion. Unicode is a world of its own. Therefore this would be the idea:

Create a separate directory (OxygenUnicode?), copy oxygen.dll, co2.exe, \inc\rtl32.inc and \inc\rtl64.inc into the folder and its subfolder, create an \examples folder, and start converting those examples that can be done using Unicode. This will take some time. Nothing has to be changed in the original installation and there is no need for #ifdef unicode, yet the differences between ANSI and Unicode could be worked out much more easily. The include files in the \inc subfolder would mostly be Unicode-specific.

Would this approach make sense?
Title: Re: Using OxygenBasic with Unicode
Post by: JRS on July 04, 2018, 11:30:07 AM
I like thinBasic's approach with a core and extended with modules.
Title: Re: Using OxygenBasic with Unicode
Post by: Charles Pegge on July 04, 2018, 09:21:41 PM
Many thanks for your examples, Roland.

We have quite a good collection now. I'll think about how Unicode should be integrated with the rest of the system, for instance being used in combination with Opengl and maths, as well as Windows GUI.
Title: Re: Using OxygenBasic with Unicode
Post by: Arnold on July 05, 2018, 12:12:41 AM
Hi Charles,

my intention is not to meddle with OxygenBasic in any way. It is clear to me that O2 can deal with Unicode very well. Yet I found that it is also a different way of programming.
Perhaps other options are possible, e.g. Unicode-specific include files (ConsoleW.inc, MinWinW.inc, CoreWinW.inc etc.) which use only the Unicode functions, and an extension like .ubas/.oubas for the Unicode programs. But I think that the directive #ifdef unicode, which is used in some languages, would increase the size of the source code and also make it a bit unclear.

Nevertheless UTF-16 (Windows) and UTF-8 (Linux, OSX) are encodings which are used worldwide. Without them some applications like translator apps would not be possible in such a convenient way.

Roland
Title: Re: Using OxygenBasic with Unicode
Post by: José Roca on July 05, 2018, 07:43:25 AM
Duplicating procedures and include files is an utter waste of time. Unicode fits all, so change the code to work with Unicode and forget all that useless ANSI stuff forever. Windows supports the "A" functions for backward compatibility only, but almost all the "A" functions are mere wrappers that convert the string parameters to Unicode and call the "W" functions. I don't know why so many people are afraid of Unicode. It does not bite.
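To make the "wrapper" point concrete, here is a rough Python sketch of the string conversion such an "A" function has to perform before it can hand over to its "W" counterpart. It is an illustration only, not Windows code: `ansi_to_wide` is a hypothetical name, and Windows-1252 is merely assumed as the ANSI code page (on a real system the conversion depends on the active code page, as with `MultiByteToWideChar` and `CP_ACP`).

```python
def ansi_to_wide(ansi_bytes: bytes, codepage: str = "cp1252") -> bytes:
    """Decode ANSI bytes with a code page and re-encode them as UTF-16 LE.

    Hypothetical helper; cp1252 is an assumed default code page.
    """
    return ansi_bytes.decode(codepage).encode("utf-16-le")

wide = ansi_to_wide(b"Caf\xe9")   # "Café" in Windows-1252
print(wide.hex(" "))              # 43 00 61 00 66 00 e9 00
```

Every call through an "A" function pays for this conversion, which is one reason to call the "W" functions directly.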
Title: Re: Using OxygenBasic with Unicode
Post by: JRS on July 05, 2018, 10:35:37 AM
Would that lock O2 into Windows only? I would like to see a flag like base index option that would default strings to unicode or ansi.
Title: Re: Using OxygenBasic with Unicode
Post by: José Roca on July 05, 2018, 10:57:47 AM
If you haven't noticed it, it is already locked. It uses BSTRings and the Windows OLE engine. Even ANSI strings are BSTRings allocated with SysAllocStringByteLen. Anyway, what I was talking about was that using ANSI instead of Unicode with the Windows API is useless.
Title: Re: Using OxygenBasic with Unicode
Post by: Charles Pegge on July 05, 2018, 11:35:09 AM
Switching from ANSI to Unicode would not be difficult. But what concerns me is the schism created between physical keyboards, programming languages, symbols (dlls use ANSI) on one hand,  and the Unicode world on the other.


PS: We could carry our own bstring layer for other platforms.
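For readers unfamiliar with the format: a BSTR is a block of memory with a four-byte length prefix (a byte count), the character data, and a two-byte null terminator. The Python sketch below only mimics that layout in a byte buffer to show how little a portable "bstring layer" would have to model; `make_bstr` and `read_bstr` are hypothetical names, and a real BSTR pointer refers to the first character rather than to the prefix.

```python
import struct

def make_bstr(text: str) -> bytes:
    """Build a BSTR-like buffer: length prefix, UTF-16 LE data, terminator."""
    data = text.encode("utf-16-le")
    # The prefix counts bytes of data and does not include the terminator.
    return struct.pack("<I", len(data)) + data + b"\x00\x00"

def read_bstr(buf: bytes) -> str:
    """Read the text back using only the length prefix."""
    (nbytes,) = struct.unpack_from("<I", buf, 0)
    return buf[4:4 + nbytes].decode("utf-16-le")

buf = make_bstr("Hi")
print(buf.hex(" "))      # 04 00 00 00 48 00 69 00 00 00
print(read_bstr(buf))    # Hi
```
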
Title: Re: Using OxygenBasic with Unicode
Post by: JRS on July 05, 2018, 01:29:08 PM
I personally have yet to do a project for a client where unicode was a requirement. I see multi-language support more prevalent in web programming.

Quote
Unicode contains a repertoire of over 137,000 characters covering 146 modern and historic scripts, as well as multiple symbol sets.

Quote
UTF-8 is a compromise character encoding that can be as compact as ASCII (if the file is just plain English text) but can also contain any unicode characters (with some increase in file size). UTF stands for Unicode Transformation Format. The '8' means it uses 8-bit blocks to represent a character.
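As a small illustration of that trade-off (a Python sketch; `encoded_sizes` is a hypothetical helper), the same character can take a different number of bytes in UTF-8 and in UTF-16:

```python
def encoded_sizes(ch: str) -> tuple:
    """Return (UTF-8 byte count, UTF-16 byte count) for one character."""
    return len(ch.encode("utf-8")), len(ch.encode("utf-16-le"))

# "A", e with acute accent, euro sign, bicycle emoji
for ch in ["A", "\u00e9", "\u20ac", "\U0001F6B2"]:
    utf8, utf16 = encoded_sizes(ch)
    print(f"U+{ord(ch):04X}: UTF-8 {utf8} byte(s), UTF-16 {utf16} byte(s)")
```

Plain ASCII text stays at one byte per character in UTF-8, while characters outside the Basic Multilingual Plane need four bytes in both encodings.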

Quote
PS: We could carry our own bstring layer for other platforms.

It would be great to see how far we could get running 64 bit O2 on Linux with header changes and tweaks.

Does ASM source code support Unicode text?
Title: Re: Using OxygenBasic with Unicode
Post by: Charles Pegge on July 07, 2018, 06:37:36 AM
Hi John,

ASM knows nothing about strings, only bytes, words etc. but o2 string literals are currently treated as 8bit ASCII.

o2 scripts are 8bit by default, but an o2 compiler could pass UTF16 scripts after specifying o2_mode 2 (or 10 for bstrings). This would convert down to UTF8 internally.
Title: Re: Using OxygenBasic with Unicode
Post by: Charles Pegge on July 07, 2018, 09:31:47 AM
Oops, sorry, I clipped your message, John. Yes, I tend to agree: UTF8 is going to remain the industry standard until Unicode virtual keyboards become the norm. It may happen!

The ideal unicode symbol for a do loop would be a bicycle :)
0x1F6B2
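Since 0x1F6B2 lies above 0xFFFF, it does not fit into a single 16-bit unit; UTF-16 stores it as a surrogate pair. A Python sketch of the arithmetic (`to_surrogates` is a hypothetical helper name):

```python
def to_surrogates(cp: int) -> tuple:
    """Split a supplementary-plane code point into a UTF-16 surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("code point is not in a supplementary plane")
    v = cp - 0x10000                        # 20-bit value
    return 0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF)

high, low = to_surrogates(0x1F6B2)
print(f"{high:04X} {low:04X}")                 # D83D DEB2
print(chr(0x1F6B2).encode("utf-8").hex(" "))   # f0 9f 9a b2
```

So the bicycle takes two 16-bit units in a wstring and four bytes in UTF-8.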
Title: Re: Using OxygenBasic with Unicode
Post by: JRS on July 07, 2018, 11:43:37 AM
It would be fun to create a UTF8 symbol gcc preprocessor like C BASIC as a C front end language.

Curious. How do upper/lower case functions work outside the ANSI / ASCII range?
Title: Re: Using OxygenBasic with Unicode
Post by: Charles Pegge on July 08, 2018, 06:05:16 PM
For those scripts which have upper/lower case..

Greek/Coptic uses a +32 displacement, like Ascii:

https://en.wikipedia.org/wiki/Greek_and_Coptic
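The +32 displacement can be checked with a few lines of Python (a sketch; `shifted_upper` is a hypothetical helper). It holds for the basic Greek letters, but not for every character in the block, so a real case-conversion routine needs a mapping table rather than a fixed offset:

```python
def shifted_upper(ch: str) -> str:
    """Naive upper-casing by subtracting 32 from the code point."""
    return chr(ord(ch) - 32)

# alpha, beta, omega: the fixed displacement matches the real mapping
for ch in "\u03b1\u03b2\u03c9":
    print(ch, shifted_upper(ch), shifted_upper(ch) == ch.upper())

# Exceptions inside the same block:
final_sigma = "\u03c2"        # upper-cases to U+03A3
alpha_tonos = "\u03ac"        # upper-cases to U+0386
print(ord(final_sigma) - ord(final_sigma.upper()))   # 31
print(ord(alpha_tonos) - ord(alpha_tonos.upper()))   # 38
```
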
Title: Re: Using OxygenBasic with Unicode
Post by: José Roca on July 08, 2018, 06:22:59 PM
"I don't object to foreigners speaking a foreign language. I just wish they'd all speak the same foreign language." - From Billy Wilder's movie "Avanti".
Title: Re: Using OxygenBasic with Unicode
Post by: JRS on July 08, 2018, 06:45:30 PM
The computer revolution helped English be an international language.
Title: Re: Using OxygenBasic with Unicode
Post by: José Roca on July 08, 2018, 06:51:53 PM
Indeed. The manuals were so badly translated that it was impossible to understand them, so it was necessary to learn English to read the originals.

But these pesky foreigners still insist on speaking and writing in their foreign languages. That's why Unicode is needed.
Title: Re: Using OxygenBasic with Unicode
Post by: JRS on July 08, 2018, 06:58:31 PM
My crystal ball says English will become the earth's native language and all others will be secondary languages kept alive by tradition.

China is going to be the hardest nut to crack.

Cost alone will be the driving factor.
Title: Re: Using OxygenBasic with Unicode
Post by: José Roca on July 08, 2018, 07:27:02 PM
Your crystal ball is broken. English is the most used secondary language, but there are more native speakers of Chinese and Spanish than of English. The US now has 41 million native Spanish speakers plus a further 11.6 million who are bilingual. More than in Spain!

Title: Re: Using OxygenBasic with Unicode
Post by: JRS on July 08, 2018, 07:31:10 PM
I wonder how much traffic non-English sites get compared to English speaking sites.

Real money is done In English.

@José - when did you learn English?
Quote
The number of non-English pages is rapidly expanding. The use of English online increased by around 281 percent from 2001 to 2011, a lower rate of growth than that of Spanish (743 percent), Chinese (1,277 percent), Russian (1,826 percent) or Arabic (2,501 percent) over the same period.
Title: Re: Using OxygenBasic with Unicode
Post by: Mike Lobanovsky on July 08, 2018, 10:33:39 PM
Quote
Real money is done In English.

No John.

The total of Asian and European billionaires is roughly twice as large (https://en.wikipedia.org/wiki/List_of_countries_by_the_number_of_billionaires) as North America's total.

So, real money is made primarily in Chinese, Arabic, Russian, and Old World tongues, i.e. it comes from territories where the English language is not spoken or even understood. There are indeed very few Frenchmen or Germans or Russians, to say nothing of Chinese, with whom you can communicate in English.

Note that most of the North American money also comes from overseas.

North America is a great place to live in and a good market to sell in, but it offers only a limited list of goods, commodities and services that the world is eager to buy. ;)
Title: Re: Using OxygenBasic with Unicode
Post by: José Roca on July 09, 2018, 07:27:12 AM
I learned English mainly by reading the press and Microsoft documentation. Therefore, my grammar is deficient and my pronunciation awful. When I was a student, they taught French instead of English in the schools of my country. I can read in seven languages, but to speak them well you need to practice a lot. I only speak my mother language (Valencian) and Spanish well.

Title: Re: Using OxygenBasic with Unicode
Post by: JRS on July 09, 2018, 09:13:59 AM
You're more amazing than I have given you credit for to date.