Author Topic: DLLC  (Read 28675 times)


JRS

  • Guest
Re: DLLC
« Reply #75 on: November 07, 2014, 07:09:50 PM »
Charles,

I was thinking maybe something like this for a FORMAT string would be more SB friendly.

Code: Script BASIC
fmtstr = "L,L,C,L,R~-###,###.00~,R"

The text alignment is independent of any numeric formatting applied. Integers may need only justification and no format mask. SB uses a "%~-###,###.00~" mask format as an alternative to the standard C-style "%d".
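One thing to watch with a comma-delimited format string: the mask can itself contain commas, so a plain split on "," would cut "R~-###,###.00~" in two. Here is a minimal sketch, in Python rather than SB, of a splitter that keeps everything between the ~ markers together. parse_fmtstr is a hypothetical helper name, not part of SB or DLLC.

```python
def parse_fmtstr(fmtstr):
    """Split 'L,L,C,L,R~-###,###.00~,R' into (alignment, mask) pairs.

    Commas between a pair of ~ markers belong to the mask, not to the
    field list. Hypothetical helper for illustration only.
    """
    fields = []
    current = []
    in_mask = False
    for ch in fmtstr:
        if ch == "~":
            in_mask = not in_mask        # entering or leaving a mask
            current.append(ch)
        elif ch == "," and not in_mask:
            fields.append("".join(current))  # field boundary
            current = []
        else:
            current.append(ch)
    fields.append("".join(current))

    result = []
    for field in fields:
        align, _, rest = field.partition("~")
        mask = rest.rstrip("~") if rest else None
        result.append((align or None, mask))
    return result


if __name__ == "__main__":
    for pair in parse_fmtstr("L,L,C,L,R~-###,###.00~,R"):
        print(pair)
```

An empty field comes back as (None, None), so a caller can treat it as "no alignment requested".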

Quote from: SB docs
An alternate format BASIC-like for numbers has the form %~format~ where format can be:

# Digit or space

0 Digit or zero

^ Stores a number in exponential format. Unlike QB's USING format this is a place-holder like the #.

. The position of the decimal point.

, Separator.

- Stores minus if the number is negative.

+ Stores the sign of the number.
« Last Edit: November 08, 2014, 10:57:49 AM by John »

Mike Lobanovsky

  • Guest
Re: DLLC
« Reply #76 on: November 08, 2014, 01:59:48 AM »
Gentlemen,

In my humble opinion there is nothing better than a standard C language printf() primitive. It is present on every platform in every implementation of a C runtime that's there as a system library. An average implementation of printf() function is more than 500 lines of C code long.

Do you really think it's worth trying to beat the stake?

(P.S. When answering, please consider that we're talking about a language feature here rather than a simple know-how tip.)

Charles Pegge

  • Guest
Re: DLLC
« Reply #77 on: November 08, 2014, 03:55:51 AM »
Hi Aurel,

Quote
Is there function like Parse(delimiter,text) or in another words is there a easy way to modify
function  GetTextWord  to function buffer = Parse(delimiter,text)

I use several specialised word parsers in the o2 compiler. Their behaviour is set for specific tasks, but they all have the same params: string s, i byref.

The most important one for raw Basic is tword(s,i)

They also set various global state variables

ascw : ascii of first character
ascn : ascii of next word / end of line
swd  : start position
lenw : length of word


i       : (byref) position of next word / end of line
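To illustrate the calling pattern only (a string plus a position that is advanced on every call), here is a minimal Python sketch. It is not o2's tword: next_word, words and the Word tuple are hypothetical names, splitting on whitespace is an assumption, and the parser state is returned as a tuple rather than stored in globals.

```python
from typing import NamedTuple


class Word(NamedTuple):
    text: str     # the word itself
    start: int    # start position (compare swd)
    length: int   # length of the word (compare lenw)
    next_i: int   # position to continue scanning from (compare byref i)


def next_word(s, i=0):
    """Skip whitespace from position i, then collect one word."""
    n = len(s)
    while i < n and s[i].isspace():
        i += 1
    start = i
    while i < n and not s[i].isspace():
        i += 1
    return Word(s[start:i], start, i - start, i)


def words(s):
    """Collect every word by calling next_word until it returns nothing."""
    out = []
    i = 0
    while True:
        w = next_word(s, i)
        if w.length == 0:
            return out
        out.append(w.text)
        i = w.next_i


if __name__ == "__main__":
    print(words("  let a  = 42 "))
```

A Parse(delimiter, text) style function could be built the same way by testing for the delimiter instead of whitespace.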

complete set of parsing functions (FB)

src/o2lexi.bas
« Last Edit: November 08, 2014, 04:23:11 AM by Charles Pegge »

Charles Pegge

  • Guest
Re: DLLC
« Reply #78 on: November 08, 2014, 04:08:34 AM »
Hi John,

I'll focus on the text-table processing aspect for now. It will be a very useful generic for databases, as well as text tabulation.

There will be functions for extracting rows and columns which will provide useful outputs, compatible with the SB splitter / array-maker.

Charles Pegge

  • Guest
Re: DLLC
« Reply #79 on: November 08, 2014, 04:19:08 AM »

Hi Mike,

Yes, we can pass the whole printf, or rather, sprintf stuff onto MSVCRT :)

For Basic number formats, I quite like 0 and ##.## placeholder notation, but in a utility function rather than core o2.

JRS

  • Guest
Re: DLLC
« Reply #80 on: November 08, 2014, 09:49:53 AM »
Sounds good. SB's splitter/array maker works well for gluing together diverse matrix structures (indexed, associative, or a combination of both). Thanks for the update!

I'm going to be digging into Dave's SB array handling code to get a better handle on how it works. Hopefully all O2 will have to do is the sort part.

This will also enhance the TinyScheme SB interface when Lisp lists returned as strings need further processing.

Just thinking a None alignment option/place holder might be more flexible.

OR

,, = None or Default action taken.

Code: Script BASIC
fmtstr = "L,,,R~-##.00~,R~-##,###.00~,R"

Quote
Hide messages posted by members on my ignore list.

Ignore list, nice!
« Last Edit: November 08, 2014, 05:21:25 PM by John »

JRS

  • Guest
Re: DLLC
« Reply #81 on: November 08, 2014, 06:21:26 PM »
Quote
n padding spaces between columns

If this parameter is 0 (zero) then could you use a TAB character? This would be handy as my SPLIT delimiter.

JRS

  • Guest
Re: DLLC
« Reply #82 on: November 08, 2014, 11:44:44 PM »
Charles,

Here is a quick prototype in SB of the direction I think you're going. Can you confirm?

Code: Script BASIC
FUNCTION FormatLine(ln)
  SPLITA ln BY dlm TO col
  FOR x = 0 to UBOUND(col)
    IF fmt[x,0] = "L" THEN
      tmp = LEFT(col[x] & STRING(fmt[x,1]," "),fmt[x,1])
      IF spc = 0 THEN
        tmp &= "\t"
      ELSE
        tmp &= STRING(spc," ")
      END IF
      rs &= tmp
    ELSE IF fmt[x,0] = "R" THEN
      tmp = RIGHT(STRING(fmt[x,2]," ") & FORMAT(fmt[x,2],col[x]), fmt[x,1])
      IF spc = 0 THEN
        tmp &= "\t"
      ELSE
        tmp &= STRING(spc," ")
      END IF
      rs &= tmp
    END IF
  NEXT
  FormatLine = rs
END FUNCTION

spc = 1
dlm = ","
rs = ""

fmt[0,0]="L"
fmt[0,1]=10
fmt[1,0]="R"
fmt[1,1]=10
fmt[1,2]="%~$-###.00~"

PRINT FormatLine("John,1.5"),"\n"


jrs@laptop:~/sb/sb22/test$ scriba fmtline.sb
John       $   1.50
jrs@laptop:~/sb/sb22/test$



« Last Edit: November 09, 2014, 08:42:22 AM by John »

Charles Pegge

  • Guest
Re: DLLC
« Reply #83 on: November 09, 2014, 01:13:46 AM »
Hi John,

Yes, similar, but column maximums are collected first, then overridden by any specified widths.

I think padding might be better expressed as a string. This will satisfy the requirements of both data layouts and displayed text layouts. For instance, using vertical bars between columns:

" | "
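A rough Python sketch of that two-pass idea, under my own assumptions about the details: the first pass collects the widest cell per column, explicit widths then override those maximums, and the padding is an arbitrary string. format_table and its parameters are hypothetical names, not the o2 table_utils API.

```python
def format_table(rows, widths=None, pad=" | "):
    """Lay out rows of strings as aligned text lines.

    rows   : list of lists of strings (rows may be ragged)
    widths : optional {column_index: width} overrides
    pad    : string placed between columns
    """
    ncols = max(len(row) for row in rows)

    # Pass 1: collect the maximum cell length for each column.
    col_w = [0] * ncols
    for row in rows:
        for c, cell in enumerate(row):
            col_w[c] = max(col_w[c], len(cell))

    # Explicit widths override the collected maximums.
    for c, w in (widths or {}).items():
        col_w[c] = w

    # Pass 2: truncate or pad each cell to its column width.
    lines = []
    for row in rows:
        cells = [row[c] if c < len(row) else "" for c in range(ncols)]
        lines.append(pad.join(cell[:col_w[c]].ljust(col_w[c])
                              for c, cell in enumerate(cells)))
    return lines


if __name__ == "__main__":
    for line in format_table([["John", "1.5"], ["Charles", "20"]]):
        print(line)
```

Passing pad="\t" or pad="," would give a data layout instead of a display layout.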

JRS

  • Guest
Re: DLLC
« Reply #84 on: November 09, 2014, 08:39:19 AM »
The | (vertical bar) sounds like a good delimiter for the format string.

I'll clean up my line-by-line method, making column widths a requirement and using my format array. I also plan to accept only a comma-separated values (CSV) input string. ("A String",123,...)

FYI - SB has a secret (undocumented) weapon called SPLITAQ, for splitting CSV strings.  :)

Quote from: SB source
SPLITAQ

SPLITAQ string BY string QUOTE string TO array

Split a string into an array using the second string as delimiter.
The delimited fields may optionally be quoted with the third string.
If the string to be split has zero length the array becomes undefined.
When the delimiter is a zero length string each array element will contain a
single character of the string.

Leading and trailing delimiters are accepted and return an empty element
in the array. For example :-

Code: Script BASIC
   SPLITAQ ",'A,B',C," BY "," QUOTE "'" TO Result
   will generate
Code: Script BASIC
                 Result[0] = ""
                 Result[1] = "A,B"
                 Result[2] = "C"
                 Result[3] = ""

Note that this kind of handling of trailing and leading empty elements is different
from the handling of the same by the command SPLIT and SPLITA which do ignore
those empty elements. This command is useful to handle lines exported as CSV from
Excel or similar application.

The QUOTE string is really a string and need not be a single character. If there is an
unmatched quote string in the string to be split then the rest of the string until its end
is considered quoted.
« Last Edit: November 09, 2014, 12:09:59 PM by John »

JRS

  • Guest
T.bas - CSV Line Formatter
« Reply #85 on: November 09, 2014, 01:20:56 PM »
Here is my CSV line formatter. It should be self-explanatory. I'm going to add this to the existing T.bas (Tools) extension module include file.

Quote from: 4 Mike
As an alternative to the BASIC like format mask, you are free to use the SB printf style format mask.

%[flags][width][.precision]type  type = can only be "dioxXueEfgGsc".

Code: Script BASIC
' result = FormatLine(in_str, fmt_str, quo_char, num_spc) Note: num_spc = -1 uses TAB

FUNCTION FormatLine(ln,fmtstr,qc,nsp)
  SPLITAQ ln BY "," QUOTE qc TO col
  SPLITA fmtstr BY "|" TO fmtcmd
  rs = ""
  FOR x = 0 to UBOUND(col)
    SPLITA fmtcmd[x] BY ":" TO fmt
    IF fmt[0] = "L" THEN
      tmp = LEFT(col[x] & STRING(fmt[1]," "),fmt[1])
      GOSUB Margin
    ELSE IF fmt[0] = "R" THEN
      IF fmt[2] <> undef THEN
        tmp = FORMAT(fmt[2],col[x])
      ELSE
        tmp = col[x]
      END IF
      tmp = RIGHT(STRING(fmt[1]," ") & tmp, fmt[1])
      GOSUB Margin
    ELSE IF fmt[0] = "C" THEN
      pad = fmt[1] - LEN(col[x])
      pboth = pad \ 2
      prt = pad % 2
      tmp = STRING(pboth," ") & col[x] & STRING(pboth," ") & STRING(prt," ")
      GOSUB Margin
    END IF
  NEXT
  GOTO Done

  Margin:
  IF nsp = -1 THEN
    tmp &= "\t"
  ELSE
    tmp &= STRING(nsp," ")
  END IF
  rs &= tmp
  RETURN

  Done:
  FormatLine = rs
END FUNCTION

amt = "|C:6|R:10:%~#,###.00~"
fmtstr = "L:20|R:5" & amt & amt & amt & amt
PRINT FormatLine("\"John Spikowski\",123,30,10.5,60,20.75,90,35.25,120,1234.99",fmtstr,"\"",0),"\n"


jrs@laptop:~/sb/sb22/test$ time scriba fmtline.sb
John Spikowski      123  30     10.50  60     20.75  90     35.25 120  1,234.99

real   0m0.008s
user   0m0.008s
sys    0m0.000s
jrs@laptop:~/sb/sb22/test$


If using nsp = -1 (TAB character)

John Spikowski         123     30        10.50     60        20.75     90        35.25    120     1,234.99


FYI

If you wish to skip a column, put S (or just about anything else besides L, C or R) in its position, as shown below. An empty field (||) also works.

Code: Script BASIC
fmtstr = "L:20|S|C:6|R:10:%~#,###.00~|C:6|R:10:%~#,###.00~|C:6|R:10:%~#,###.00~|C:6|R:10:%~#,###.00~"

John Spikowski        30     10.50  60     20.75  90     35.25 120  1,234.99
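For readers without SB at hand, the L / R / C / skip logic can be sketched in Python roughly as follows. This is an approximation, not a port: format_line is a hypothetical name, the input is assumed to be already split into columns, widths are assumed to be positive, and the optional third part of an R field is a Python format spec such as ",.2f" standing in for the SB mask.

```python
def format_line(cols, fmtstr, margin=""):
    """Format a list of column strings using 'L:20|R:5|C:6|R:10:,.2f'.

    L = left justify, R = right justify (optional numeric format),
    C = centre; any other code skips the column.
    """
    out = []
    for value, cmd in zip(cols, fmtstr.split("|")):
        parts = cmd.split(":", 2)
        kind = parts[0]
        if kind not in ("L", "R", "C"):
            continue                      # unknown code: skip the column
        width = int(parts[1])
        if kind == "L":
            text = value.ljust(width)[:width]
        elif kind == "R":
            if len(parts) > 2:
                value = format(float(value), parts[2])
            text = value.rjust(width)[-width:]
        else:
            pad = max(width - len(value), 0)
            left = pad // 2               # any odd space goes on the right
            text = " " * left + value + " " * (pad - left)
        out.append(text + margin)
    return "".join(out)


if __name__ == "__main__":
    fmt = "L:20|R:5|C:6|R:10:,.2f"
    print(format_line(["John Spikowski", "123", "30", "10.5"], fmt))
    print(format_line(["John Spikowski", "123", "120", "1234.99"],
                      "L:20|S|C:6|R:10:,.2f"))
```

As in the SB version, the margin is appended after every emitted column, so margin="\t" gives the TAB-separated variant.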



« Last Edit: November 10, 2014, 08:03:08 PM by John »

JRS

  • Guest
Re: T.bas - CSV Line Formatter
« Reply #86 on: November 09, 2014, 08:47:55 PM »
While searching for some .csv sample data to play with, I ran into this data set. It had the right number of columns and rows.



Code: Script BASIC
' result = FormatLine(in_str, fmt_str, quo_char, num_spc) Note: num_spc = -1 uses TAB

FUNCTION FormatLine(ln,fmtstr,qc,nsp)
  SPLITAQ ln BY "," QUOTE qc TO col
  SPLITA fmtstr BY "|" TO fmtcmd
  rs = ""
  FOR x = 0 to UBOUND(col)
    SPLITA fmtcmd[x] BY ":" TO fmt
    IF fmt[0] = "L" THEN
      tmp = LEFT(col[x] & STRING(fmt[1]," "),fmt[1])
      GOSUB Margin
    ELSE IF fmt[0] = "R" THEN
      IF fmt[2] <> undef THEN
        tmp = FORMAT(fmt[2],col[x])
      ELSE
        tmp = col[x]
      END IF
      tmp = RIGHT(STRING(fmt[1]," ") & tmp, fmt[1])
      GOSUB Margin
    ELSE IF fmt[0] = "C" THEN
      pad = fmt[1] - LEN(col[x])
      pboth = pad \ 2
      prt = pad % 2
      tmp = STRING(pboth," ") & col[x] & STRING(pboth," ") & STRING(prt," ")
      GOSUB Margin
    END IF
  NEXT
  GOTO Done

  Margin:
  IF nsp = -1 THEN
    tmp &= "\t"
  ELSE
    tmp &= STRING(nsp," ")
  END IF
  rs &= tmp
  RETURN

  Done:
  FormatLine = rs
END FUNCTION

OPEN "SacramentocrimeJanuary2006.csv" FOR INPUT AS #1
OPEN "sac.fmt" FOR OUTPUT AS #2
fmtstr = "L:15|L:30|R:4|L:4|R:6|L:35|L:6|R:10:%~-##0.0000~|R:10:%~-##0.0000~"
LINE INPUT #1, hdr
WHILE NOT EOF(1)
  LINE INPUT #1, csvln
  csvln = CHOMP(csvln)
  PRINT #2, FormatLine(csvln,fmtstr,"",2),"\n"
WEND

CLOSE(1)
CLOSE(2)

Output (7584 rows)

jrs@laptop:~/sb/sb22/test$ time scriba fmtline.sb

real   0m0.454s
user   0m0.415s
sys    0m0.036s
jrs@laptop:~/sb/sb22/test$

Code: [Select]
1/1/06 0:00      3108 OCCIDENTAL DR                 3  3C      1115  10851(A)VC TAKE VEH W/O OWNER        2404       38.5504   -121.3914 
1/1/06 0:00      2082 EXPEDITION WAY                5  5A      1512  459 PC  BURGLARY RESIDENCE           2204       38.4735   -121.4902 
1/1/06 0:00      4 PALEN CT                         2  2A       212  10851(A)VC TAKE VEH W/O OWNER        2404       38.6578   -121.4621 
1/1/06 0:00      22 BECKFORD CT                     6  6C      1443  476 PC PASS FICTICIOUS CHECK         2501       38.5068   -121.4270 
1/1/06 0:00      3421 AUBURN BLVD                   2  2A       508  459 PC  BURGLARY-UNSPECIFIED         2299       38.6374   -121.3846 
« Last Edit: November 15, 2015, 08:06:27 PM by John »

Charles Pegge

  • Guest
Re: DLLC
« Reply #87 on: November 11, 2014, 04:32:44 AM »
Hi John,

I've added sort-by-column and CSV support to the o2 table_utils. Also using MSVCRT sprintf to format numerics, if a format string is defined per column.

To read/write CSV, the delimiter is simply set to "," including the double quotes.

To get printf formats, the format specifier is included between quotes: '%f'

Thus a format string might look like this

" L 20  L 20 C R 20 '%e'  R 20 '%f' "
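To make the layout of such a format string concrete, here is a hedged Python sketch of how it could be tokenised and applied. It is a guess at the grammar from the single example above, not the o2 table_utils parser: parse_colspec and format_cell are hypothetical names, and Python's % operator stands in for MSVCRT sprintf, so the exact output of formats like '%e' may differ.

```python
import shlex


def parse_colspec(spec):
    """Turn " L 20  L 20 C R 20 '%e' " into a list of column dicts.

    Assumed grammar: an alignment letter starts a column, a number sets
    its width, and a quoted token is its printf-style format.
    """
    columns = []
    for token in shlex.split(spec):      # shlex strips the quotes
        if token in ("L", "R", "C"):
            columns.append({"align": token, "width": None, "fmt": None})
        elif not columns:
            raise ValueError("spec must start with L, R or C")
        elif token.isdigit():
            columns[-1]["width"] = int(token)
        else:
            columns[-1]["fmt"] = token
    return columns


def format_cell(value, col):
    """Apply the column's printf format (if any), then justify."""
    text = col["fmt"] % float(value) if col["fmt"] else str(value)
    width = col["width"] or len(text)
    if col["align"] == "R":
        return text.rjust(width)
    if col["align"] == "C":
        return text.center(width)
    return text.ljust(width)


if __name__ == "__main__":
    cols = parse_colspec(" L 20  L 20 C R 20 '%e'  R 20 '%f' ")
    print(cols)
    print(repr(format_cell("2.5", cols[4])))
```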

I'm still testing...

JRS

  • Guest
Re: DLLC
« Reply #88 on: November 11, 2014, 08:12:07 AM »
Sort-by-column, cool! The standard C format masks should put a smile on Mike's face. Looking forward to giving it a try.


JRS

  • Guest
T.bas CSV to SQLite3
« Reply #89 on: November 11, 2014, 11:07:03 PM »
Charles,

Here is a prototype of my CSV2SQL function-to-be. It was so slow that I almost gave up on it, until I discovered TRANSACTION.

Quote
By default SQLite will evaluate every INSERT / UPDATE statement within a unique transaction. If performing a large number of inserts, it's advisable to wrap your operation in a transaction:

Code: Script BASIC
IMPORT sqlite.bas

OPEN "SacramentocrimeJanuary2006.csv" FOR INPUT AS #1
db = sqlite::open("sac116.db")
fmtstr = "SSISISIRR"
LINE INPUT #1, hdr
hdr = CHOMP(hdr)
SPLITA hdr BY "," TO col
SPLITA fmtstr BY "" TO typ
lastcol = UBOUND(col)
sql = "CREATE TABLE crime ("
FOR x = 0 TO lastcol
  tmp = ""
  IF typ[x] = "S" THEN
    tstr = " TEXT"
  ELSE IF typ[x] = "I" THEN
    tstr = " INTEGER"
  ELSE IF typ[x] = "R" THEN
    tstr = " REAL"
  END IF
  tmp &= col[x] & tstr
  IF x <> lastcol THEN tmp &= ", "
  sql &= tmp
NEXT
sql &= ");"
sqlite::execute(db, sql)
sqlite::execute(db, "BEGIN TRANSACTION")
WHILE NOT EOF(1)
  sql = "INSERT INTO crime VALUES ("
  LINE INPUT #1, csvln
  csvln = CHOMP(csvln)
  SPLITAQ csvln BY "," QUOTE "" TO col
  FOR x = 0 TO lastcol
    IF typ[x] = "S" THEN
      tmp = "'" & col[x] & "'"
    ELSE
      tmp = col[x]
    END IF
    IF x <> lastcol THEN tmp &= ", "
    sql &= tmp
  NEXT
  sql &= ");"
  sqlite::execute(db, sql)
WEND
sqlite::execute(db, "END TRANSACTION")
sqlite::close(db)
CLOSE(1)

Output

jrs@laptop:~/sb/sb22/test$ time scriba csv2sql.sb

real   0m0.763s
user   0m0.457s
sys   0m0.016s
jrs@laptop:~/sb/sb22/test$ sqlite3
SQLite version 3.8.2 2013-12-06 14:53:30
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> .open sac116.db
sqlite> SELECT COUNT(*) FROM crime;
7584
sqlite> .q
jrs@laptop:~/sb/sb22/test$
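For comparison, the same load pattern can be sketched with Python's standard sqlite3 module. This is an illustration under my own assumptions, not a port of the SB script: csv_to_sqlite is a hypothetical name, the header names in SAMPLE_CSV are made up, and the two data rows are copied from the formatted output earlier in the thread. All inserts run inside one transaction, and ? placeholders are used so that a text value containing an apostrophe cannot break the statement.

```python
import csv
import io
import sqlite3

TYPE_NAMES = {"S": "TEXT", "I": "INTEGER", "R": "REAL"}

# Header names are hypothetical; rows are taken from the thread's output.
SAMPLE_CSV = (
    "cdatetime,address,district,beat,grid,descr,code,latitude,longitude\n"
    "1/1/06 0:00,3108 OCCIDENTAL DR,3,3C,1115,"
    "10851(A)VC TAKE VEH W/O OWNER,2404,38.5504,-121.3914\n"
    "1/1/06 0:00,2082 EXPEDITION WAY,5,5A,1512,"
    "459 PC  BURGLARY RESIDENCE,2204,38.4735,-121.4902\n"
)


def csv_to_sqlite(conn, table, csv_text, typestr):
    """Create `table` from the CSV header and load all rows.

    typestr has one letter per column: S = TEXT, I = INTEGER, R = REAL.
    Returns the number of rows in the table afterwards.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    columns = ", ".join('"%s" %s' % (name, TYPE_NAMES[t])
                        for name, t in zip(header, typestr))
    conn.execute('CREATE TABLE "%s" (%s)' % (table, columns))

    placeholders = ", ".join("?" for _ in header)
    insert = 'INSERT INTO "%s" VALUES (%s)' % (table, placeholders)
    with conn:                       # one transaction for every insert
        conn.executemany(insert, reader)
    return conn.execute('SELECT COUNT(*) FROM "%s"' % table).fetchone()[0]


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    print(csv_to_sqlite(db, "crime", SAMPLE_CSV, "SSISISIRR"))
```

The values are passed as strings; SQLite's column affinity converts them to INTEGER or REAL according to the declared column type.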

« Last Edit: November 12, 2014, 12:20:02 AM by John »