Returns in a column
Moderators: FourthWorld, heatherlaine, Klaus, kevinmiller, robinmiller
Returns in a column
Hi Everyone
A question regarding a .csv file. I have a spreadsheet with a number of single word columns. One column however is a ‘Notes’ column with text and returns.
In Excel, the ‘Notes’ data is displayed and contained within the column. When this is imported into LiveCode each return creates a new line. So instead of 20 entries, there are 32, for example.
I have tried different delimiters for each item, however the return continue to do what it thinks it should do!!
Is there a way to ‘contain’ the text and return data in one item in an import?
Thanks
Greg
A question regarding a .csv file. I have a spreadsheet with a number of single word columns. One column however is a ‘Notes’ column with text and returns.
In Excel, the ‘Notes’ data is displayed and contained within the column. When this is imported into LiveCode each return creates a new line. So instead of 20 entries, there are 32, for example.
I have tried different delimiters for each item, however the return continue to do what it thinks it should do!!
Is there a way to ‘contain’ the text and return data in one item in an import?
Thanks
Greg
Re: Returns in a column
Heya Greg,
What is actually separating the columns? commas, tabs, some other symbol?
What is actually separating the columns? commas, tabs, some other symbol?
Re: Returns in a column
Hi Greg,
that is really a pain in the ass!
Unfortunately there is no easy solution for this problem, but I remebered this link, which give you a working function for this incinvenience: https://www.fourthworld.com/embassy/art ... t-die.html
I really dig the headline!
Best
Klaus
that is really a pain in the ass!
Unfortunately there is no easy solution for this problem, but I remebered this link, which give you a working function for this incinvenience: https://www.fourthworld.com/embassy/art ... t-die.html
I really dig the headline!
Best
Klaus
Re: Returns in a column
LOL ! I remember when I read that, and it is still a good read today
I still think using null characters is the way to go with making a csv. After all, they can't be typed in on a keyboard, and most people would have no idea how to create one.
You might also want to look at this thread, and in particular this post by Thierry in particular for another take on it.
I still think using null characters is the way to go with making a csv. After all, they can't be typed in on a keyboard, and most people would have no idea how to create one.
You might also want to look at this thread, and in particular this post by Thierry in particular for another take on it.
Re: Returns in a column
Hi,
abusing a spreadsheet for layout or database tasks is the first step toward the swamps of sorrow ...
Additionally, Excel (at least my 2003 here) has poor support for .csv (Character Separated Values).
Still you're not lost:
2 modes, switched whenever you encounter [field delimiter & quote] (= In Field) and
[quote & field/record delimiter] (= Out of Field).
When "in Field", change the dangerous chars (that are delimiters outside of it) to something innocent (•¶Ÿ¥¤¦¬ ...).
Quick & dirty code example (not really tested!):
Here I only changed hazardous characters inside the text fields, and removed the quotes Excel used to designate these. Feel free to change these again when inserting the data in the appropriate fields of your GUI.
Notice that ";" and CR (the Excel delimiters), are now only used as such anymore, and can be quickly replaced to anything more desirable, at the end of the script.
When it comes to ASCII delimiters, look here, I usually suggest ASCII 28 - 31, as these are designated as such! But the ones I used work well, too (and are visible ;-) ).
Have fun!
abusing a spreadsheet for layout or database tasks is the first step toward the swamps of sorrow ...
Additionally, Excel (at least my 2003 here) has poor support for .csv (Character Separated Values).
Still you're not lost:
- When saving as .csv (comma/semicolon separated) or .txt (tab separated) Excel will quote fields that contain 'dangerous' chars.
So your task is to repeat through the text until you find [delimiter& quote], from there you're "in a field", until you find the closing [quote & delimiter]. And so on.
Not elegant, but rather fast.
. - Or you read your Excel .csv into Access, and export it from there - in Access you can specify the separator chars. (But Access will replace your CR with empty)
. - Or you could use ODBC to connect to your Excel sheet, and use it with the revdb functions. With similar problems, I assume.
2 modes, switched whenever you encounter [field delimiter & quote] (= In Field) and
[quote & field/record delimiter] (= Out of Field).
When "in Field", change the dangerous chars (that are delimiters outside of it) to something innocent (•¶Ÿ¥¤¦¬ ...).
Quick & dirty code example (not really tested!):
Code: Select all
put ";" into myFD -- this is what Excel gives me as FD
put CR into myRD
put "•" into myReplFD -- new field delimiter in text field
put "¬" into myReplCR -- new CR in text field
put "¤" into myReplQuote -- temp. char for unused quotes, to kill later
put URL "file:D:\_Development\Mappe1.csv" into myData
put false into myInFld
repeat with i = 1 to len(myData)
put char i of myData into thisChar
if not myInFld then -- we're outside a quoted field:
if (thisChar = myFD) and (char i+1 of myData = quote) then -- and found the beginning of a quote
put true into myInFld -- switch mode
put myReplQuote into char i+1 of myData -- prepare quote for killing
end if
else -- we're inside a quoted field:
if (thisChar = quote) and \
((char i+1 of myData = myFD) or (char i+1 of myData = myRD)) then -- and found the end of a quote
put false into myInFld -- switch mode again
put myReplQuote into char i of myData -- prepare quote for killing
else
if (thisChar = myFD) then put myReplFD into char i of myData -- replace FD inside text fld
if (thisChar = myRD) then put myReplCR into char i of myData -- replace RD inside text fld
end if
end if
end repeat
replace myReplQuote with empty in myData -- kill unused quotes
put myData
Notice that ";" and CR (the Excel delimiters), are now only used as such anymore, and can be quickly replaced to anything more desirable, at the end of the script.
When it comes to ASCII delimiters, look here, I usually suggest ASCII 28 - 31, as these are designated as such! But the ones I used work well, too (and are visible ;-) ).
Have fun!
All code published by me here was created with Community Editions of LC (thus is GPLv3).
If you use it in closed source projects, or for the Apple AppStore, or with XCode
you'll violate some license terms - read your relevant EULAs & Licenses!
If you use it in closed source projects, or for the Apple AppStore, or with XCode
you'll violate some license terms - read your relevant EULAs & Licenses!
-
- Livecode Opensource Backer
- Posts: 9388
- Joined: Fri Feb 19, 2010 10:17 am
- Location: Bulgaria
Re: Returns in a column
Just run through you "guff" as it is imported and replace the carriage returns with something else . . .
Re: Returns in a column
Thank you all for your replies. I have been wandering around in this 'swamp of sorrow' for some time now!
I will continue to see if I can get out, but I need some more assistance, before I go under
AxWald, I liked your approach, but couldn't get your script to work. Am I missing something?
I also enjoyed Richard's article and all the discussion that followed so many years ago!
I still have some things to try, but if there are any scraps of food available to feed the crocodiles in the swamp, they will gratefully be accepted.
Cheers
Greg
I will continue to see if I can get out, but I need some more assistance, before I go under
AxWald, I liked your approach, but couldn't get your script to work. Am I missing something?
I also enjoyed Richard's article and all the discussion that followed so many years ago!
I still have some things to try, but if there are any scraps of food available to feed the crocodiles in the swamp, they will gratefully be accepted.
Cheers
Greg
-
- Livecode Opensource Backer
- Posts: 9388
- Joined: Fri Feb 19, 2010 10:17 am
- Location: Bulgaria
Re: Returns in a column
I have a feeling that you need to replace your CR chars with something like a SPACE.
HOWEVER, as far as I recall, there are several different Unicode addresses that render as a CR . . .
The 'standard' CR is U+000D
This is slightly unpleasant:
https://stackoverflow.com/questions/309 ... -form-feed
You have, certainly got to work out if your problem is with CR or LF (line feed).
'standard' LF is U+000A
Here's something lumpy as well:
"What is CRLF and LF?
The term CRLF refers to Carriage Return (ASCII 13, \r) Line Feed (ASCII 10, \n). ... For example: in Windows both a CR and LF are required to note the end of a line, whereas in Linux/UNIX a LF is only required. In the HTTP protocol, the CR-LF sequence is always used to terminate a line."
One thing that I don't understand is HOW (except, possibly by Copy-Paste) text with CR chars was inserted into spreadsheet cells. Certainly, in CALC (LibreOffice) hitting the ENTER key takes me to another cell in the spreadsheet.
Um, possibly: "By default key enter finishes editing, to let users to insert LF by Enter key, set AcceptEnters='1', the editing will be finished by Ctrl+Enter or Shift+Enter."
HOWEVER, as far as I recall, there are several different Unicode addresses that render as a CR . . .
The 'standard' CR is U+000D
This is slightly unpleasant:
https://stackoverflow.com/questions/309 ... -form-feed
You have, certainly got to work out if your problem is with CR or LF (line feed).
'standard' LF is U+000A
Here's something lumpy as well:
"What is CRLF and LF?
The term CRLF refers to Carriage Return (ASCII 13, \r) Line Feed (ASCII 10, \n). ... For example: in Windows both a CR and LF are required to note the end of a line, whereas in Linux/UNIX a LF is only required. In the HTTP protocol, the CR-LF sequence is always used to terminate a line."
One thing that I don't understand is HOW (except, possibly by Copy-Paste) text with CR chars was inserted into spreadsheet cells. Certainly, in CALC (LibreOffice) hitting the ENTER key takes me to another cell in the spreadsheet.
Um, possibly: "By default key enter finishes editing, to let users to insert LF by Enter key, set AcceptEnters='1', the editing will be finished by Ctrl+Enter or Shift+Enter."
Re: Returns in a column
Hi,
When I paste my code into a button & run it, msg contains:
Remember:
In this case you better start the code this way:
Another culprit may be the line/ record delimiter. My text editor shows me the line delim in the text field as LF, whereas the records are delimited with CRLF. My use of CR in the code should take care of this (CR/ return has shown as rather robust, for me)
Then, it may be that you're using a bleeding edge version of LC that has a bug/ regression - I write with 6.7.10 & tested with 9.0.1, and it worked.
Hope this helps, have fun!
It may be that your Excel writes different .csv's. My demo was this: Saved as .csv it looks like this:
Code: Select all
a;"This is
a text.";c
1;2;3
Code: Select all
a;This is¬a text.;c
1;2;3
It may be that your Excel writes the csv with comma - it may be that in English based language settings this is used. After all, me, based in western Europe, never ever encountered one of the real comma-delimited .csvs that Richards condemns with such ferocity ;-)
In this case you better start the code this way:
Code: Select all
put comma into myFD
-- put ";" into myFD -- this is what Excel gives me as FD
Then, it may be that you're using a bleeding edge version of LC that has a bug/ regression - I write with 6.7.10 & tested with 9.0.1, and it worked.
Hope this helps, have fun!
All code published by me here was created with Community Editions of LC (thus is GPLv3).
If you use it in closed source projects, or for the Apple AppStore, or with XCode
you'll violate some license terms - read your relevant EULAs & Licenses!
If you use it in closed source projects, or for the Apple AppStore, or with XCode
you'll violate some license terms - read your relevant EULAs & Licenses!
Re: Returns in a column
Thank you so much AxWald for your very clear explanation. Much appreciated.
That was the problem. I am in Australia, so there are many commas running wild here. I have decided to use OpenOffice, as it provides more control over what csv’s will look like when saved. I’ve abandoned Excel, as it was stripping out any European characters in the spreadsheet too (not so commonly used here in OZ).
You have given me much to play with, so I will see if I can put this issue to rest (for me at least), but I may be back
When I ran your solution with my database (over 11,000 entries, but over 33,000 lines when imported!!), it wasn’t happy and I needed to stop the import after more than 10 minutes.
Regards
Greg
That was the problem. I am in Australia, so there are many commas running wild here. I have decided to use OpenOffice, as it provides more control over what csv’s will look like when saved. I’ve abandoned Excel, as it was stripping out any European characters in the spreadsheet too (not so commonly used here in OZ).
You have given me much to play with, so I will see if I can put this issue to rest (for me at least), but I may be back
When I ran your solution with my database (over 11,000 entries, but over 33,000 lines when imported!!), it wasn’t happy and I needed to stop the import after more than 10 minutes.
Regards
Greg
-
- VIP Livecode Opensource Backer
- Posts: 9842
- Joined: Sat Apr 08, 2006 7:05 am
- Location: Los Angeles
- Contact:
Re: Returns in a column
IIRC, the field enhancements in v5.5 include rendering ASCII 11 as Vertical Tab.
Richard Gaskin
LiveCode development, training, and consulting services: Fourth World Systems
LiveCode Group on Facebook
LiveCode Group on LinkedIn
LiveCode development, training, and consulting services: Fourth World Systems
LiveCode Group on Facebook
LiveCode Group on LinkedIn
Re: Returns in a column
Hi,
where the "¬" are CRs.
Records: ~300000; Size resulting: ~ 15 MB
My script (as in my prev. post) runs it in ~8 - 10 seconds on my venerable HP business desktop. So you obviously have huge records, and the data file must be in the GB area.
(Not to mention that my script fails if the first field in a record is a text field ...)
I tried some other possibilities (using offset() and such ...) and all was ways slower.
We could speed it up using a "repeat for each char" approach (building an output variable during the loop), but this would also double the RAM requirements. And I assume we're already tight here?
What would I do? Well, .csv's usually come from databases, or can be fed to them.
The export as this here (mySQL dialect):
will save you a lot of hassle. You may replace 'CHAR(10)' with 'CHAR(13)' or 'CHAR(13,10)', depending of your data.
If you have access to a database on a real database server, this is not noticeable slower than a normal export.
Have fun!
Ooops. Wanted to do some timing, and created a file like this:
Code: Select all
"zero¬one",two,three,four,five,six,seven,eight,nine
zero,"one¬two",three,four,five,six,seven,eight,nine
zero,one,"two¬three",four,five,six,seven,eight,nine
...
Records: ~300000; Size resulting: ~ 15 MB
My script (as in my prev. post) runs it in ~8 - 10 seconds on my venerable HP business desktop. So you obviously have huge records, and the data file must be in the GB area.
(Not to mention that my script fails if the first field in a record is a text field ...)
I tried some other possibilities (using offset() and such ...) and all was ways slower.
We could speed it up using a "repeat for each char" approach (building an output variable during the loop), but this would also double the RAM requirements. And I assume we're already tight here?
What would I do? Well, .csv's usually come from databases, or can be fed to them.
The export as this here (mySQL dialect):
Code: Select all
SELECT ID, REPLACE(aText, CHAR(10), '¬') AS newText FROM `t_test` ...
If you have access to a database on a real database server, this is not noticeable slower than a normal export.
Have fun!
All code published by me here was created with Community Editions of LC (thus is GPLv3).
If you use it in closed source projects, or for the Apple AppStore, or with XCode
you'll violate some license terms - read your relevant EULAs & Licenses!
If you use it in closed source projects, or for the Apple AppStore, or with XCode
you'll violate some license terms - read your relevant EULAs & Licenses!
Re: Returns in a column
I have been trying to apply Alex Tweedly’s script but am struggling to understand how I use it. I haven’t required a custom function before so have been reading and exploring how they are constructed.
Unfortunately I can’t work out how to call this script or what ‘defaults’ the script is referring to. Therefore I keep on bumping into a brick wall (a few actually!).
Any assistance to get me out of this dark space will be greatly appreciated. (i.e., how to refer to the function in a button script, and what defaults are needed - the tab delimited file, I can do!)
Thanks
Greg
Re: Returns in a column
Hi All
I have been exploring AxWald’s approach to removing returns in an item and am having success at removing the returns from imported data, adding data to this item, then replacing the returns on export - to be opened in OpenOffice (then Joomla).
I know that item 15 in the import has returns within the item, so only need to convert this item only.
With small numbers of individual items (abut 4,000 in total), the conversion takes about 7 seconds. When the number of items in the data comes larger, (6,000) the conversion takes about 40 seconds. Over 70,000 items takes over 10 minutes. (There are over 180,000 items in the current database, so the conversion is not practical.)
Is there a way of speeding up this script?
Cheers
Greg
I have been exploring AxWald’s approach to removing returns in an item and am having success at removing the returns from imported data, adding data to this item, then replacing the returns on export - to be opened in OpenOffice (then Joomla).
I know that item 15 in the import has returns within the item, so only need to convert this item only.
With small numbers of individual items (abut 4,000 in total), the conversion takes about 7 seconds. When the number of items in the data comes larger, (6,000) the conversion takes about 40 seconds. Over 70,000 items takes over 10 minutes. (There are over 180,000 items in the current database, so the conversion is not practical.)
Is there a way of speeding up this script?
Code: Select all
on mouseUp
set the itemDelimiter to ";" -- each item is also quoted
put field "Data" into tDataBaseHold -- there are 16 items for each student entry
put the number of items of tDataBaseHold-1 into tHowManyItems -- minus 1 works?
put the round of (tHowManyItems / 15) into tStudentNumbers -- get the number of student entries in the database
-- want the 15th item for each student (a data area with returns within the item)
put 0 into tCounter
repeat for tStudentNumbers
add 1 to tCounter
put tCounter * 15 into tNextSet -- item 15 of the next student
put item tNextSet of tDataBaseHold into tSecond
repeat for each char tChar in tSecond
if tChar is return then
replace return with "¬" in tChar
end if
put tChar after tCharHold
end repeat
put tCharHold into item tNextSet in tDataBaseHold
end repeat
put tDataBaseHold into field "Result"
end mouseUp
Greg
Re: Returns in a column
Interesting how writing it down and seeing the information in another format can help!!
I have answered my own question!
I'll now run some tests.
I have answered my own question!
Code: Select all
repeat for tStudentNumbers
add 1 to tCounter
put tCounter * 15 into tNextSet -- item 15 of the next student
put item tNextSet of tDataBaseHold into tSecond
replace return with "¬" in tSecond
put tSecond into item tNextSet in tDataBaseHold
end repeat