Author Topic: DLLC (Read 37258 times)

JRS · « **Reply #75 on:** November 07, 2014, 07:09:50 PM »

Charles,

I was thinking maybe something like this for a FORMAT string would be more SB friendly.

Code: Script BASIC

fmtstr = "L,L,C,L,R~-###,###.00~,R"
 

The text alignment is independent of the numeric formatting, if any is applied. Integers may need only justification, with no format mask. SB uses a "%~-###,###.00~" mask format as an alternative to the standard C-style "%d".

Quote from: SB docs

An alternate format BASIC-like for numbers has the form %~format~ where format can be:

# Digit or space

0 Digit or zero

^ Stores a number in exponential format. Unlike QB's USING format this is a place-holder like the #.

. The position of the decimal point.

, Separator.

- Stores minus if the number is negative.

+ Stores the sign of the number.
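
One wrinkle with a comma-separated format string like the one proposed above is that the numeric mask can itself contain commas, so a plain split on "," would cut `R~-###,###.00~` in two. Below is a minimal Python sketch of a splitter that treats everything between a pair of `~` as opaque. The name `parse_fmt` and the (alignment, mask) tuple layout are illustrative assumptions, not part of SB or O2.

```python
def parse_fmt(fmtstr):
    """Split 'L,L,C,R~mask~,R' into (alignment, mask) pairs.

    Commas inside a ~...~ mask are kept as part of the mask.
    """
    fields, current, in_mask = [], "", False
    for ch in fmtstr:
        if ch == "~":
            in_mask = not in_mask        # toggle on every tilde
            current += ch
        elif ch == "," and not in_mask:
            fields.append(current)       # a real column separator
            current = ""
        else:
            current += ch
    fields.append(current)

    specs = []
    for field in fields:
        align, _, rest = field.partition("~")
        mask = rest.rstrip("~") if rest else None
        specs.append((align, mask))
    return specs

print(parse_fmt("L,L,C,L,R~-###,###.00~,R"))
```

A column with no mask comes back with `None` in the second slot, so the caller can apply justification only.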

Mike Lobanovsky · « **Reply #76 on:** November 08, 2014, 01:59:48 AM »

Gentlemen,

In my humble opinion there is nothing better than a standard C language printf() primitive. It is present on every platform in every implementation of a C runtime that's there as a system library. An average implementation of printf() function is more than 500 lines of C code long.

Do you really think it's worth trying to beat the stake?

(P.S. When answering, please consider that we're talking about a language feature here rather than a simple know-how tip)

Charles Pegge · « **Reply #77 on:** November 08, 2014, 03:55:51 AM »

Hi Aurel,

Quote

Is there function like Parse(delimiter,text) or in another words is there a easy way to modify
function GetTextWord to function buffer = Parse(delimiter,text)

I use several specialised word parsers in the o2 compiler. Their behaviour is set for specific tasks, but they all have the same params: string s, i byref.

The most important one for raw Basic is tword(s,i)

They also set various global state variables

ascw : ascii of first character
ascn : ascii of next word / end of line
swd : start position
lenw : length of word

i : (byref) position of next word / end of line

complete set of parsing functions (FB)

src/o2lexi.bas
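
For readers without the o2 source at hand, the calling convention described above (a string plus a position that is advanced past each word, with details of the last word left in shared state) can be sketched in Python. This is not a port of tword: the whitespace-only word rule, the `WordState` class, the 0-based positions and the tuple return (standing in for the byref `i`) are all assumptions. Only the field names ascw, ascn, swd and lenw come from the post.

```python
from dataclasses import dataclass

@dataclass
class WordState:
    ascw: int = 0   # character code of the word's first character
    ascn: int = 0   # character code at the start of the next word (0 at end)
    swd: int = 0    # start position of the word (0-based in this sketch)
    lenw: int = 0   # length of the word

state = WordState()

def next_word(s, i):
    """Return (word, new_i); new_i points at the next word or the end of s."""
    n = len(s)
    while i < n and s[i].isspace():      # skip leading blanks
        i += 1
    start = i
    while i < n and not s[i].isspace():  # consume the word
        i += 1
    word = s[start:i]
    while i < n and s[i].isspace():      # advance to the next word
        i += 1
    state.swd = start
    state.lenw = len(word)
    state.ascw = ord(word[0]) if word else 0
    state.ascn = ord(s[i]) if i < n else 0
    return word, i

text = "print a + b"
i = 0
words = []
while i < len(text):
    w, i = next_word(text, i)
    words.append(w)
print(words)
```

A Parse(delimiter, text) style function, as asked about in the quote, would be the same loop with the `isspace()` tests swapped for a comparison against the delimiter.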

Charles Pegge · « **Reply #78 on:** November 08, 2014, 04:08:34 AM »

Hi John,

I'll focus on the text-table processing aspect for now. It will be a very useful generic for databases, as well as text tabulation.

There will be functions for extracting rows and columns which will provide useful outputs, compatible with the SB splitter / array-maker.

Charles Pegge · « **Reply #79 on:** November 08, 2014, 04:19:08 AM »

Hi Mike,

Yes, we can pass the whole printf, or rather, sprintf stuff onto MSVCRT

For Basic number formats, I quite like 0 and ##.## placeholder notation, but in a utility function rather than core o2.

JRS · « **Reply #80 on:** November 08, 2014, 09:49:53 AM »

Sounds good. SB's splitter/array maker works well gluing together diverse matrix structures. (index, associative or a combination of both) Thanks for the update!

I'm going to be digging into Dave's SB array handling code to get a better handle on how it works. Hopefully all O2 will have to do is the sort part.

This will also help the TinyScheme SB interface, where Lisp lists returned as strings need further processing.

Just thinking a None alignment option/place holder might be more flexible.

OR

,, = None or Default action taken.

Code: Script BASIC

fmtstr = "L,,,R~-##.00~,R~-##,###.00~,R"
 

Quote

Hide messages posted by members on my ignore list.

Ignore list, nice!

JRS · « **Reply #81 on:** November 08, 2014, 06:21:26 PM »

Quote

n padding spaces between columns

If this parameter is 0 (zero) then could you use a TAB character? This would be handy as my SPLIT delimiter.

JRS · « **Reply #82 on:** November 08, 2014, 11:44:44 PM »

Charles,

Here is a quick prototype in SB of the direction I think you're going. Can you confirm?

Code: Script BASIC

 
FUNCTION FormatLine(ln)
  SPLITA ln BY dlm TO col
  FOR x = 0 to UBOUND(col)
    IF fmt[x,0] = "L" THEN
      tmp = LEFT(col[x] & STRING(fmt[x,1]," "),fmt[x,1])
      IF spc = 0 THEN 
        tmp &= "\t"
      ELSE
        tmp &= STRING(spc," ")
      END IF
      rs &= tmp
    ELSE IF fmt[x,0] = "R" THEN   
      tmp = RIGHT(STRING(fmt[x,1]," ") & FORMAT(fmt[x,2],col[x]), fmt[x,1])
      IF spc = 0 THEN 
        tmp &= "\t"
      ELSE
        tmp &= STRING(spc," ")
      END IF
      rs &= tmp
    END IF
  NEXT
  FormatLine = rs
END FUNCTION             
 
spc = 1
dlm = ","
rs = ""
 
fmt[0,0]="L"
fmt[0,1]=10
fmt[1,0]="R"
fmt[1,1]=10
fmt[1,2]="%~$-###.00~"
 
PRINT FormatLine("John,1.5"),"\n"
 

jrs@laptop:~/sb/sb22/test$ scriba fmtline.sb
John $ 1.50
jrs@laptop:~/sb/sb22/test$

Charles Pegge · « **Reply #83 on:** November 09, 2014, 01:13:46 AM »

Hi John,

Yes, similar, but column maximums are collected first, then overridden by any specified widths.

I think padding might be better expressed as a string. This will satisfy the requirements of both data layouts and displayed text layouts. For instance, using vertical bars between columns:

" | "

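The two-pass idea described in this reply (collect column maximums first, let specified widths override them, and join cells with a padding string) can be sketched in Python as follows. This is only an illustration of the approach, not the o2 table_utils code: the function name `layout`, the `widths` dictionary and the choice to truncate cells that exceed a forced width are assumptions.

```python
def layout(rows, widths=None, pad=" | "):
    """Left-align every cell to its column width and join cells with pad.

    widths maps column index -> forced width; other columns use the
    longest cell found in a first pass over the data.
    """
    ncols = max(len(r) for r in rows)
    colw = [0] * ncols
    for r in rows:                       # pass 1: collect maximums
        for c, cell in enumerate(r):
            colw[c] = max(colw[c], len(cell))
    for c, w in (widths or {}).items():  # specified widths win
        colw[c] = w
    lines = []
    for r in rows:                       # pass 2: emit padded rows
        cells = [cell[:colw[c]].ljust(colw[c]) for c, cell in enumerate(r)]
        lines.append(pad.join(cells))
    return lines

for line in layout([["John", "1.5"], ["Charles", "20.75"]]):
    print(line)
```

Passing `pad="\t"` or `pad=","` gives a data layout rather than a display layout, which is the point of expressing the padding as a string instead of a count.
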
JRS · « **Reply #84 on:** November 09, 2014, 08:39:19 AM »

The | (vertical bar) sounds like a good delimiter for the format string.

I'll clean up my line-by-line method, making column widths a requirement and using my format array. I also plan to accept only a comma-separated values (CSV) input string. ("A String",123,...)

FYI - SB has a secret weapon (undocumented) called SPLITAQ. (for splitting CSV strings)

Quote from: SB source

SPLITAQ

SPLITAQ string BY string QUOTE string TO array

Split a string into an array using the second string as delimiter.
The delimited fields may optionally be quoted with the third string.
If the string to be split has zero length the array becomes undefined.
When the delimiter is a zero length string each array element will contain a
single character of the string.

Leading and trailing delimiters are accepted and return an empty element
in the array. For example :-

Code: Script BASIC
SPLITAQ ",'A,B',C," BY "," QUOTE "'" TO Result

will generate
Code: Script BASIC
Result[0] = ""
Result[1] = "A,B"
Result[2] = "C"
Result[3] = ""

Note that this kind of handling of trailing and leading empty elements is different
from the handling of the same by the command SPLIT and SPLITA which do ignore
those empty elements. This command is useful to handle lines exported as CSV from
Excel or similar application.

The QUOTE string is really a string and need not be a single character. If there is an
unmatched quote string in the string to be split then the rest of the string until its end
is considered quoted.
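
To make the quoted rules concrete, here is a rough Python approximation of the behaviour the SB source comment describes. It is a sketch written from that description only, not from the SB implementation: the function name `splitaq`, returning `None` for the "undefined" array, and recognising a quote string in the middle of a field are assumptions.

```python
def splitaq(text, delim, quote):
    """Approximate SPLITAQ: split text by delim, honouring quote strings."""
    if text == "":
        return None                  # SB leaves the array undefined
    if delim == "":
        return list(text)            # one character per element
    fields, current, i = [], "", 0
    while i < len(text):
        if quote and text.startswith(quote, i):
            end = text.find(quote, i + len(quote))
            if end == -1:            # unmatched quote: the rest is quoted
                current += text[i + len(quote):]
                i = len(text)
            else:                    # copy the quoted part verbatim
                current += text[i + len(quote):end]
                i = end + len(quote)
        elif text.startswith(delim, i):
            fields.append(current)   # empty fields are kept
            current = ""
            i += len(delim)
        else:
            current += text[i]
            i += 1
    fields.append(current)
    return fields

print(splitaq(",'A,B',C,", ",", "'"))
```

The call at the end uses the example from the SB comment, with its leading and trailing delimiters.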

JRS · « **Reply #85 on:** November 09, 2014, 01:20:56 PM »

Here is my CSV line formatter. It should be self explanatory. I'm going to add this to the existing T.bas (Tools) extension module include file.

Quote from: 4 Mike

As an alternative to the BASIC like format mask, you are free to use the SB printf style format mask.

%[flags][width][.precision]type

type can only be one of "dioxXueEfgGsc".

Code: Script BASIC

' result = FormatLine(in_str, fmt_str, quo_char, num_spc) Note: num_spc = -1 uses TAB
 
FUNCTION FormatLine(ln,fmtstr,qc,nsp)
  SPLITAQ ln BY "," QUOTE qc TO col
  SPLITA fmtstr BY "|" TO fmtcmd
  rs = ""
  FOR x = 0 to UBOUND(col)
    SPLITA fmtcmd[x] BY ":" TO fmt
    IF fmt[0] = "L" THEN
      tmp = LEFT(col[x] & STRING(fmt[1]," "),fmt[1])
      GOSUB Margin
    ELSE IF fmt[0] = "R" THEN
      IF fmt[2] <> undef THEN
        tmp = FORMAT(fmt[2],col[x])
      ELSE
        tmp = col[x]
      END IF
      tmp = RIGHT(STRING(fmt[1]," ") & tmp, fmt[1])
      GOSUB Margin
    ELSE IF fmt[0] = "C" THEN
      pad = fmt[1] - LEN(col[x])
      pboth = pad \ 2
      prt = pad % 2
      tmp = STRING(pboth," ") & col[x] & STRING(pboth," ") & STRING(prt," ")
      GOSUB Margin
    END IF
  NEXT
  GOTO Done
 
  Margin:
  IF nsp = -1 THEN
    tmp &= "\t"
  ELSE
    tmp &= STRING(nsp," ")
  END IF
  rs &= tmp  
  RETURN
  
  Done:
  FormatLine = rs
END FUNCTION
 
amt = "|C:6|R:10:%~#,###.00~"
fmtstr = "L:20|R:5" & amt & amt & amt & amt 
PRINT FormatLine("\"John Spikowski\",123,30,10.5,60,20.75,90,35.25,120,1234.99",fmtstr,"\"",0),"\n"
 

jrs@laptop:~/sb/sb22/test$ time scriba fmtline.sb
John Spikowski 123 30 10.50 60 20.75 90 35.25 120 1,234.99

real 0m0.008s
user 0m0.008s
sys 0m0.000s
jrs@laptop:~/sb/sb22/test$

If using nsp = -1 (TAB character)
John Spikowski 123 30 10.50 60 20.75 90 35.25 120 1,234.99

FYI

If you wish to skip a column, use S (or just about anything else besides L, C, R) as its format code. An empty || also works.

Code: Script BASIC

fmtstr = "L:20|S|C:6|R:10:%~#,###.00~|C:6|R:10:%~#,###.00~|C:6|R:10:%~#,###.00~|C:6|R:10:%~#,###.00~"
 

John Spikowski 30 10.50 60 20.75 90 35.25 120 1,234.99
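
Skipping works because a column is only emitted when its code matches L, R or C. Any other code falls through every branch, so neither the cell nor its margin is appended. The Python sketch below mirrors that dispatch and the centering arithmetic (an odd leftover space goes on the right). It is an approximation for illustration only: it takes an already split list of columns and ignores the optional numeric mask, and the name `format_line` is an assumption.

```python
def format_line(cols, fmtstr, nsp):
    """Apply 'L:w', 'R:w' and 'C:w' column specs; other codes skip the column."""
    margin = "\t" if nsp == -1 else " " * nsp
    out = ""
    for col, cmd in zip(cols, fmtstr.split("|")):
        parts = cmd.split(":")
        kind = parts[0]
        if kind not in ("L", "R", "C"):
            continue                          # e.g. "S" or "": nothing emitted
        width = int(parts[1])
        if kind == "L":
            cell = (col + " " * width)[:width]    # pad right, cut to width
        elif kind == "R":
            cell = (" " * width + col)[-width:]   # pad left, keep the tail
        else:
            pad = width - len(col)
            half, extra = pad // 2, pad % 2       # odd space goes on the right
            cell = " " * half + col + " " * (half + extra)
        out += cell + margin
    return out

print(repr(format_line(["John", "123", "x"], "L:6|S|C:4", 1)))
```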

JRS · « **Reply #86 on:** November 09, 2014, 08:47:55 PM »

While searching for some .csv sample data to play with, I ran into this data set. It had the right number of columns and rows.

Code: Script BASIC

' result = FormatLine(in_str, fmt_str, quo_char, num_spc) Note: num_spc = -1 uses TAB
 
FUNCTION FormatLine(ln,fmtstr,qc,nsp)
  SPLITAQ ln BY "," QUOTE qc TO col
  SPLITA fmtstr BY "|" TO fmtcmd
  rs = ""
  FOR x = 0 to UBOUND(col)
    SPLITA fmtcmd[x] BY ":" TO fmt
    IF fmt[0] = "L" THEN
      tmp = LEFT(col[x] & STRING(fmt[1]," "),fmt[1])
      GOSUB Margin
    ELSE IF fmt[0] = "R" THEN
      IF fmt[2] <> undef THEN
        tmp = FORMAT(fmt[2],col[x])
      ELSE
        tmp = col[x]
      END IF
      tmp = RIGHT(STRING(fmt[1]," ") & tmp, fmt[1])
      GOSUB Margin
    ELSE IF fmt[0] = "C" THEN
      pad = fmt[1] - LEN(col[x])
      pboth = pad \ 2
      prt = pad % 2
      tmp = STRING(pboth," ") & col[x] & STRING(pboth," ") & STRING(prt," ")
      GOSUB Margin
    END IF
  NEXT
  GOTO Done
 
  Margin:
  IF nsp = -1 THEN
    tmp &= "\t"
  ELSE
    tmp &= STRING(nsp," ")
  END IF
  rs &= tmp  
  RETURN
  
  Done:
  FormatLine = rs
END FUNCTION
 
OPEN "SacramentocrimeJanuary2006.csv" FOR INPUT AS #1
OPEN "sac.fmt" FOR OUTPUT AS #2
fmtstr = "L:15|L:30|R:4|L:4|R:6|L:35|L:6|R:10:%~-##0.0000~|R:10:%~-##0.0000~"
LINE INPUT #1, hdr
WHILE NOT EOF(1)
  LINE INPUT #1, csvln
  csvln = CHOMP(csvln)
  PRINT #2, FormatLine(csvln,fmtstr,"",2),"\n"
WEND  
 
CLOSE(1)
CLOSE(2)
 

Output (7584 rows)
jrs@laptop:~/sb/sb22/test$ time scriba fmtline.sb

real 0m0.454s
user 0m0.415s
sys 0m0.036s
jrs@laptop:~/sb/sb22/test$

Code: [Select]

1/1/06 0:00      3108 OCCIDENTAL DR                 3  3C      1115  10851(A)VC TAKE VEH W/O OWNER        2404       38.5504   -121.3914  
1/1/06 0:00      2082 EXPEDITION WAY                5  5A      1512  459 PC  BURGLARY RESIDENCE           2204       38.4735   -121.4902  
1/1/06 0:00      4 PALEN CT                         2  2A       212  10851(A)VC TAKE VEH W/O OWNER        2404       38.6578   -121.4621  
1/1/06 0:00      22 BECKFORD CT                     6  6C      1443  476 PC PASS FICTICIOUS CHECK         2501       38.5068   -121.4270  
1/1/06 0:00      3421 AUBURN BLVD                   2  2A       508  459 PC  BURGLARY-UNSPECIFIED         2299       38.6374   -121.3846

Charles Pegge · « **Reply #87 on:** November 11, 2014, 04:32:44 AM »

Hi John,

I've added sort-by-column and CSV support to the o2 table_utils. Also using MSVCRT sprintf to format numerics, if a format string is defined per column.

To read/write CSV, the delimiter is simply set to "," including the double quotes.

To get printf formats, the format specifier is included between quotes: '%f'

Thus a format string might look like this

" L 20 L 20 C R 20 '%e' R 20 '%f' "

I'm still testing...
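
As a reading aid, here is how a format string of that shape could be interpreted, sketched in Python. The grammar is inferred from the single example above (an alignment letter, an optional width, an optional quoted printf format) and may not match what table_utils finally accepts. The names `parse_table_fmt` and `render` are hypothetical, and Python's `%` operator stands in for MSVCRT sprintf, so the output may differ in detail.

```python
import re

# A token is either a 'quoted printf format' or a bare word such as L or 20.
TOKEN = re.compile(r"'([^']*)'|(\S+)")

def parse_table_fmt(fmt):
    """Turn " L 20 C R 20 '%f' " into (align, width, printf_format) tuples."""
    specs = []
    for quoted, bare in TOKEN.findall(fmt):
        if bare in ("L", "C", "R"):
            specs.append([bare, None, None])   # start a new column
        elif bare.isdigit():
            specs[-1][1] = int(bare)           # width of the current column
        elif bare == "":
            specs[-1][2] = quoted              # printf-style format
        else:
            raise ValueError("unexpected token: " + bare)
    return [tuple(s) for s in specs]

def render(value, spec):
    """Format one cell: optional printf-style conversion, then alignment."""
    align, width, pfmt = spec
    text = pfmt % float(value) if pfmt else str(value)
    if width is None:
        return text
    if align == "R":
        return text.rjust(width)
    if align == "C":
        return text.center(width)
    return text.ljust(width)

for spec in parse_table_fmt(" L 20 L 20 C R 20 '%e' R 20 '%f' "):
    print(spec)
```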

JRS · « **Reply #88 on:** November 11, 2014, 08:12:07 AM »

Sort-by-column, cool! The standard C format masks should put a smile on Mike's face. Looking forward to giving it a try.

JRS · « **Reply #89 on:** November 11, 2014, 11:07:03 PM »

Charles,

Here is a prototype of my CSV2SQL function-to-be. I almost gave up on it because it was so slow, until I discovered TRANSACTION.

Quote

By default SQLite will evaluate every INSERT / UPDATE statement within a unique transaction. If performing a large number of inserts, it's advisable to wrap your operation in a transaction:

Code: Script BASIC

IMPORT sqlite.bas
 
OPEN "SacramentocrimeJanuary2006.csv" FOR INPUT AS #1
db = sqlite::open("sac116.db")
fmtstr = "SSISISIRR"
LINE INPUT #1, hdr
hdr = CHOMP(hdr)
SPLITA hdr BY "," TO col
SPLITA fmtstr BY "" TO typ
lastcol = UBOUND(col)
sql = "CREATE TABLE crime ("
FOR x = 0 TO lastcol
  tmp = ""
  IF typ[x] = "S" THEN
    tstr = " TEXT"
  ELSE IF typ[x] = "I" THEN
    tstr = " INTEGER"
  ELSE IF typ[x] = "R" THEN
    tstr = " REAL"
  END IF
  tmp &= col[x] & tstr
  IF x <> lastcol THEN tmp &= ", "
  sql &= tmp
NEXT
sql &= ");"
sqlite::execute(db, sql)
sqlite::execute(db, "BEGIN TRANSACTION")
WHILE NOT EOF(1)
  sql = "INSERT INTO crime VALUES ("
  LINE INPUT #1, csvln
  csvln = CHOMP(csvln)
  SPLITAQ csvln BY "," QUOTE "" TO col
  FOR x = 0 TO lastcol
    IF typ[x] = "S" THEN
      tmp = "'" & col[x] & "'"
    ELSE
      tmp = col[x]
    END IF
    IF x <> lastcol THEN tmp &= ", "
    sql &= tmp
  NEXT
  sql &= ");"
  sqlite::execute(db, sql)
WEND
sqlite::execute(db, "END TRANSACTION")
sqlite::close(db)
CLOSE(1)
 

Output
jrs@laptop:~/sb/sb22/test$ time scriba csv2sql.sb

real 0m0.763s
user 0m0.457s
sys 0m0.016s
jrs@laptop:~/sb/sb22/test$ sqlite3
SQLite version 3.8.2 2013-12-06 14:53:30
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> .open sac116.db
sqlite> SELECT COUNT(*) FROM crime;
7584
sqlite> .q
jrs@laptop:~/sb/sb22/test$
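
For comparison, the same single-transaction idea in Python's standard `sqlite3` module, with bound parameters instead of values spliced into the SQL text. Building the statement by concatenation, as the prototype above does, would presumably fail on a text value that contains an apostrophe. The three-column table and the "O'NEIL WAY" row are made up for the illustration; only the table name and the "4 PALEN CT" values come from the sample data.

```python
import sqlite3

def load_rows(conn, rows):
    """Insert all rows inside one transaction using bound parameters."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.executemany("INSERT INTO crime VALUES (?, ?, ?)", rows)

conn = sqlite3.connect(":memory:")  # in-memory database for the demo
conn.execute("CREATE TABLE crime (address TEXT, district INTEGER, latitude REAL)")
load_rows(conn, [("4 PALEN CT", 2, 38.6578), ("O'NEIL WAY", 5, 38.4735)])
count = conn.execute("SELECT COUNT(*) FROM crime").fetchone()[0]
print(count)
```

All the inserts share one commit, which is where the speed-up over one transaction per INSERT comes from.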
