Filter command truncates lines with more than 35770 chars

paulclaude
Posts: 121
Joined: Thu Mar 27, 2008 10:19 am

Post by paulclaude » Thu Mar 26, 2009 11:47 am

The Filter command truncates lines with more than 35770 chars (it returns only the first 35770 chars of a found line).

I think this is a bug and I would like to warn everyone about this limit, but before I submit the bug to the quality center, I would like to know whether anyone already has an explanation for it.

bn
VIP Livecode Opensource Backer
Posts: 4174
Joined: Sun Jan 07, 2007 9:12 pm

Post by bn » Thu Mar 26, 2009 12:44 pm

Hi paulclaude,

the filter command works for me

Code: Select all

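on mouseUp pMouseBtnNo
    put empty into tCollect
    -- build 3 lines of 60000 random printable chars (ASCII 35 to 126)
    repeat 3
        repeat with i = 1 to 60000
            put random(126 - 35 + 1) + 35 - 1 into tChar
            put numtochar(tChar) after tCollect
        end repeat
        put return after tCollect
    end repeat
    -- add a short fourth line that certainly matches the pattern
    put "AB" after tCollect
    put return after tCollect
    -- total length before filtering
    put length(tCollect) & return into field 1
    filter tCollect with "*AB*"
    -- length after filtering, followed by the surviving lines
    put length(tCollect) after field 1
    put return & tCollect after field 1
end mouseUp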
Make a field and a button, and set the script of the button to the above script. It does what it is supposed to: the filter command returns all 4 lines because all 4 lines contain "AB". The length of each of the first 3 lines is 60000, as the script made them. No truncation.
If you still have trouble, maybe you could post your code or make a small stack and put it on revOnline so we can see what is going on.
regards
Bernd
Last edited by bn on Thu Mar 26, 2009 12:56 pm, edited 1 time in total.

Post by paulclaude » Fri Mar 27, 2009 9:13 am

Hi Bernd,

you are right, it seems things are a little more complicated. I've made a long text file (a list of songs) and a sample stack that gets a line with two methods (a simple chunk request and a filter command), with two different results. You can download it at:

http://web.tiscali.it/paulclaude/Filter_test.zip

Please put the text file on your desktop, and the script should find it and convert it, creating two new files.

Waiting for comments

Regards

Paul

Post by bn » Fri Mar 27, 2009 12:53 pm

paulclaude,
I tried your example and indeed it gives different results. I don't really know why, but one thing I noticed is that you use the binfile URL scheme for text.
From the documentation:
When you put data into a binfile URL or get data from it, Revolution does not translate end-of-line markers (ASCII 10 and ASCII 13) between the current platform's standard and Revolution's internal standard of a linefeed. This ensures that binary data, which may contain such characters, is not accidentally corrupted.

If you are working with text data (such as text in fields), use the file URL scheme instead.
Without knowing the source of the data and what you did with it, I would try to do this without the binfile (I tried a little, with inconclusive results). Your original file is linefeed (ASCII 10) delimited. Maybe you could send me the original file as it is on your system, and I will look at it Monday.
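
As a rough illustration of the difference the documentation describes, the sketch below reads the same file through both URL schemes and normalises the line endings of the binary read by hand. The handler name compareReads and the file name songs.txt are placeholders, not taken from the test stack.

```livecode
-- Sketch only: compareReads and songs.txt are hypothetical names.
on compareReads
   local tPath, tText, tRaw
   put specialFolderPath("desktop") & "/songs.txt" into tPath

   -- file: end-of-line markers are translated to linefeed on read
   put URL ("file:" & tPath) into tText

   -- binfile: bytes arrive untouched, so normalise line endings by hand
   put URL ("binfile:" & tPath) into tRaw
   replace crlf with return in tRaw
   replace numToChar(13) with return in tRaw

   -- show both line counts in the message box for comparison
   put (the number of lines of tText) && (the number of lines of tRaw)
end compareReads
```

If the two counts differ, the file probably uses line endings other than a plain linefeed.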

If you are fiddling around with iTunes, maybe you would like to take a look at Thomas MacGrawth's iTunes library for Revolution, an extensive collection of commands to work with iTunes from Rev.

http://www.lazyriversoftware.com/RevOne.html

Sorry, no straight answer. A little background on what exactly you are trying to do would help in understanding the problem.
regards
Bernd

Post by paulclaude » Fri Mar 27, 2009 2:21 pm

Hi Bernd,

I must use binfile because the original files are compressed. I've tried changing the lineDelimiter, replacing linefeeds, etc., but I always get the same result: the chunk expression works, the Filter command does not.

Thanks for the suggestion about Thomas MacGrawth's iTunes library: a very well structured collection of AppleScript routines. Unfortunately, for my work I would rather need a shell-based iTunes library, to read playlists etc. directly from the iTunes Music Library.xml file.

Regards

Paul

Post by bn » Sat Mar 28, 2009 12:49 am

Hi paulclaude,

I am a little lost. It is not your use of the binfile. I was on a train today and played around with your file for quite a while. The file seems OK: it contains just linefeeds and tabs, so no 'strange' chars (i.e. other control characters below ASCII 32).

I give up.

Maybe someone with more experience with regular expressions will jump in. I rarely use them, mostly for file type filtering. That is easy.
The more I played around with regular expressions on your file, the more I got the impression that you are right. There might be some peculiarity going on that I don't understand: I did not manage to change the original text file in a way that made the filter command give a consistent result, not to mention a sensible one. It went from bad to worse. But then again, it might be because I don't really get regular expressions.
I am afraid that I am not of any help here, but there are more experienced scripters around who might have a clue as to what is going on.

If you could describe in more detail what you want (e.g. what firstchunk1100 and chunk1155 are), it may be easier to debug for anyone in the know.
Please let me know how it goes.
regards
bernd

Post by paulclaude » Sun Mar 29, 2009 9:15 am

I've posted this 'case' because I have no explanation either. I've also tried to find special characters, but it seems to be a normal file (firstchunk1100 and chunk1155 are only placeholders; you can replace them with anything else and the result will be the same).

For the moment I will use chunk expressions to retrieve the lines, but the problem may occur for anyone, even in already published programs that manage large interactive files with the Filter command, so I hope someone solves it.

Thanks for having tried, Bernd.

Cheers

Paul

Post by bn » Sun Mar 29, 2009 11:11 pm

Hi paulclaude,
well the magic number is: 65535.

It turns out that the filter command has a limit on the size of a line. It is 65535.

I looked up the limits of revolution in the pdf. There it says "Maximum length of a line in a field 65,536 characters storage
No more than 32,786 pixels wide for display"

OK, it does not say anything about variables, and the fact that your script works fine with "line x of ......" shows that, as long as you don't put your variable into a field, a line can be longer than 65535 characters. Apparently this is not so for the filter command. In the example that I posted I only went up to 60000 chars per line; as soon as you go above the magic number, the filter command does not work anymore. That seems to be what you are running into. Unfortunately this does not seem to be documented, at least not in the Revolution User Guide PDF nor in the dictionary.
So Revolution leaves something for the user to find out....
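
A minimal sketch for checking this on a given version, adapted from the earlier 60000-char test: it builds one line longer than 65535 chars and compares what a chunk expression and the filter command return. The line length of 70000 and the "AB" marker are arbitrary choices, and the outcome may differ between engine versions.

```livecode
-- Sketch only: builds one line longer than 65535 chars and filters it.
on mouseUp
   local tLine, tData, tFiltered
   repeat 70000
      put "x" after tLine
   end repeat
   put "AB" after tLine   -- marker at the very end of the long line
   put tLine & return & "short AB" into tData

   put tData into tFiltered
   filter tFiltered with "*AB*"

   -- compare what a chunk expression and the filter command give back
   put "chunk:" && length(line 1 of tData) & return & \
         "filter:" && length(line 1 of tFiltered)
end mouseUp
```

If the two reported lengths differ, the filter command is still subject to the line-size limit in that version.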

But, knowing this, there are many ways to accomplish what you want. You could easily use the offset function to find your text and then go on from there until you hit a return; that gives you the chars that make up the line.
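
One way the offset idea could look, as a sketch rather than a tested replacement for filter: the function name linesContaining is hypothetical, it matches plain substrings only (no wildcard patterns), and it assumes the search string contains no return.

```livecode
-- Sketch only: linesContaining is a hypothetical helper.
-- Returns every line of pText that contains the plain string pNeedle.
function linesContaining pNeedle, pText
   local tHits, tSkip, tRel, tPos, tLineNo, tRelEnd
   put 0 into tSkip
   repeat forever
      -- offset() counts from the first char after the skipped ones
      put offset(pNeedle, pText, tSkip) into tRel
      if tRel is 0 then exit repeat
      put tSkip + tRel into tPos   -- absolute position of the hit

      -- find the line the hit falls in and collect it via a chunk expression
      put the number of lines of char 1 to tPos of pText into tLineNo
      put line tLineNo of pText & return after tHits

      -- continue searching after the return that ends this line
      put offset(return, pText, tPos) into tRelEnd
      if tRelEnd is 0 then exit repeat
      put tPos + tRelEnd into tSkip
   end repeat
   if tHits is not empty then delete the last char of tHits
   return tHits
end linesContaining
```

It could be called as `put linesContaining("AB", tCollect) into tMatches`. Counting lines up to each hit rescans the text every time, so for data with many matches a plain `repeat for each line` loop with `contains` may be faster.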

Your way of going for the lines is probably OK as long as you don't try to display long lines in a field.

The filter command would have been nice, but alas.
On the other hand, Revolution puts so few limits on data size that one is surprised to run into one.

Just could not give up on this one and in the train on the way back...
regards
Bernd

Post by paulclaude » Mon Mar 30, 2009 8:15 am

Well Bernd,

it seems my first assumption was not so far from reality. I consider it a bug: I already knew the line-size limit of fields (which is why I used only variables and files for testing), but the Filter command hits this limit even when working on variables.

As I wrote before, I think it's important for developers to know this limit, because they could use the Filter command with files created 'on the fly' by their apps (as in my case), producing otherwise unpredictable errors.

Cheers

Paul

Post by bn » Mon Mar 30, 2009 10:18 am

Hi paulclaude,
paulclaude wrote: I think it's important for developers to know this limit, because they could use the Filter command with files created 'on the fly' by their apps (as in my case), producing otherwise unpredictable errors.
I definitely agree. I reported it to the quality control center as Bugzilla report #7864. Since it took you so much time, you may want to add a comment or vote for it.

I am not sure it is a bug. Line filtering is obviously for filtering lines, and in most cases you eventually want to display those lines. In that case a restriction to the maximum line size is understandable.

But it should be documented.
regards
Bernd

FourthWorld
VIP Livecode Opensource Backer
Posts: 10057
Joined: Sat Apr 08, 2006 7:05 am

Post by FourthWorld » Mon Mar 30, 2009 2:16 pm

bn wrote: Hi paulclaude,
well the magic number is: 65535.

It turns out that the filter command has a limit on the size of a line. It is 65535.
The same limit had also been in place for the sort command, but was lifted in v3.5, which is currently in testing. If you're testing v3.5 you might want to see if the limit still applies to the filter command, and if so submit a request to have it raised.
Richard Gaskin

Post by paulclaude » Tue Mar 31, 2009 5:48 pm

bn wrote: I am not sure it is a bug, since line filtering obviously is for filtering lines, and in most cases you eventually want to display them. In that case the restriction to the max of the line size is understandable.
Bernd
I think it's a real bug, because it's an undocumented limit: many people (like me) may work with files (and long lines) only to use and display items of a line, so the error may be very bad and unpredictable.

I've commented on your Bugzilla report.

Cheers

Paul
