Random Sort

Moderators: FourthWorld, heatherlaine, Klaus, kevinmiller, LCMark

Locked
LCMark
Livecode Staff Member
Livecode Staff Member
Posts: 1208
Joined: Thu Apr 11, 2013 11:27 am

Random Sort

Post by LCMark » Thu May 23, 2013 12:05 pm

Following a thread on the use-list (and resulting non-bug report 10917) how about a small addition to the sort command to ease randomization of lists:

Code: Select all

  sort [ ( lines | items ) of ] <container> randomly
It would be equivalent to:

Code: Select all

  sort [ ( lines | items ) of ] <container> by random(0xffffffff)

Janschenkel
VIP Livecode Opensource Backer
VIP Livecode Opensource Backer
Posts: 977
Joined: Sat Apr 08, 2006 7:47 am
Location: Aalst, Belgium
Contact:

Re: Random Sort

Post by Janschenkel » Thu May 23, 2013 1:48 pm

Sounds easier than adding a 'shuffle' command for the same purpose ;-)
Should we also have the ability to randomly sort the cards of a stack?

Jan Schenkel.
Quartam Reports & PDF Library for LiveCode
www.quartam.com

LCMark
Livecode Staff Member
Livecode Staff Member
Posts: 1208
Joined: Thu Apr 11, 2013 11:27 am

Re: Random Sort

Post by LCMark » Thu May 23, 2013 3:01 pm

There's no reason we couldn't have:

Code: Select all

sort [ cards of ] <stack> randomly
Also.

DarScott
Posts: 227
Joined: Fri Jul 28, 2006 12:23 am
Location: Albuquerque
Contact:

Re: Random Sort

Post by DarScott » Thu May 23, 2013 5:47 pm

I think the random sort problem we saw is a posthole that any of us could step into. I might have already and I think of myself as an experienced programmer. The qualifier "randomly" will allow instant success in a task that a lot of new users will want to do.

Of course, adding things should be done with trepidation, but I like adding 'randomly' much better than adding a new command. Though structure will help a lot, the dictionary is overwhelming for some folks.

Even though the two sorts are separate dictionary entries and have some important differences, they share features and concepts and, in many people's internal grammar, they are the same thing. Because of that, I think allowing 'randomly' in both places is simpler, language-wise. Or, in a sense, adding it to sorting cards is free, there is little or no cost in language complexity and maybe even a savings. So, if 'randomly' is added, it should apply to both.

In the cleaned up syntax, will this affect the use of 'randomly' as a variable, function or command? Does this change break scripts?

Is "sort randomly" an oxymoron to some people?

LCMark
Livecode Staff Member
Livecode Staff Member
Posts: 1208
Joined: Thu Apr 11, 2013 11:27 am

Re: Random Sort

Post by LCMark » Fri May 24, 2013 12:25 pm

@DarScott: Igor posted a bug report about the 'bias' people had noticed by using too small a constant to random() and it took me a while to figure out what was going on - which is why (for an operation such as this) it would be better to make it more explicit.

In terms of syntax - I've no real objection to adding perhaps a 'shuffle' command - but it seemed cleaner to suggest an extension to sort (since that is the command that people use already to perform this operation). I must confess I hadn't even considered the potential 'oxymoron' of 'sort randomly' - however, I think it still works quite well (and adds a bit of quirk value - which at least might make some smile!).

In regards to impact on the language - only tokens added to the factor_table affect what can be declared as a variable. Since the 'randomly' keyword would only be tested for in that exact context, it would have no wider effect on the language.

DarScott
Posts: 227
Joined: Fri Jul 28, 2006 12:23 am
Location: Albuquerque
Contact:

Re: Random Sort

Post by DarScott » Sat May 25, 2013 7:05 am

I'm thinking about the phrase "is equivalent to" in the description of 'randomly'.

It sounds like a definition.

Perhaps it can be a frank description of the current implementation for those who need to know. (Or a hint to those who want to write their own scrambler.)

I'm not sure how to word a definition in that case. Maybe an intuitive one is fine.

And implementers can be free to improve on that as growing wisdom indicates.

mwieder
VIP Livecode Opensource Backer
VIP Livecode Opensource Backer
Posts: 3581
Joined: Mon Jan 22, 2007 7:36 am
Location: Berkeley, CA, US
Contact:

Re: Random Sort

Post by mwieder » Tue May 28, 2013 6:44 pm

@runrevmark:
There's no reason we couldn't have:

Code: Select all

    sort [ cards of ] <stack> randomly
Why? I've never thought of sorting cards. What does this give me? And would they be sorted by name, id, number?

mwieder
VIP Livecode Opensource Backer
VIP Livecode Opensource Backer
Posts: 3581
Joined: Mon Jan 22, 2007 7:36 am
Location: Berkeley, CA, US
Contact:

Re: Random Sort

Post by mwieder » Tue May 28, 2013 6:50 pm

Code: Select all

      sort [ ( lines | items ) of ] <container> by random(0xffffffff)
Since fields are limited to 64k characters, that seems like a large random number. I do see where in the code this limit is checked, but I'm not sure why it's in place. Is this an OS limitation? An artifact of legacy code? Is there something that would break if we increased it?

DarScott
Posts: 227
Joined: Fri Jul 28, 2006 12:23 am
Location: Albuquerque
Contact:

Re: Random Sort

Post by DarScott » Tue May 28, 2013 7:54 pm

(I don't think fields are limited to 64,000 characters, perhaps you meant something else. I'll assume 64,000 lines or items.)

I haven't looked at the code. Traditional random number generators use some arithmetic to change the state of the the generator (usually an unsigned int), and then use mod to scale that to the needed range. The appropriate conversion from floating point is to cap.

If the function really returned a random 32-bit number, then there is about 50% chance of no duplicates in assigning numbers to 64,000 items/lines. (Think birthday problem or hash collisions.) There is likely to be several in a list of 200,000 items, but I don't think you will be able to measure the impact in 'randomly' without millions of trials (or more, I didn't do the math).

However, the traditional random number generator does not have a random state, but its state follows and then cycles through some path of numbers. The size of the final ring is smaller than 2^32 for 32-bit generators. You are not going to get duplicates for 200,000 item lists. I don't know what computational problem is like to guess the last part of a random sort based on the pattern of the first part when the sort is of a list with a known order and the generator is known. However, if you have a generator that has a 64-bit internal state, then you can get duplicates when mod about 32 bits, but it is harder to predict the outcome from a partial outcome. (This is "can get duplicates" in a theoretical sense.)

It might be nice if there is a function that returns values from a wide range of numbers that are evenly distributed based on '=' and the underlying number system. That range can grow as numbers grow or the random number generator grows. In might be random() with no parameters which returns numbers between 1 and n where n is big and can grow over versions of LiveCode. (No parameters for random() is now a compile error, so scripts won't break in adding this.) Or a new function might return numbers between 0 and 1 ("between" defined later). If arbitrary precision comes, then that would be limited the same way that it is in 1/3. Anyway, maybe 'randomly' can be defined in terms of that, and "that" will adapt as LiveCode grows.

mwieder
VIP Livecode Opensource Backer
VIP Livecode Opensource Backer
Posts: 3581
Joined: Mon Jan 22, 2007 7:36 am
Location: Berkeley, CA, US
Contact:

Re: Random Sort

Post by mwieder » Tue May 28, 2013 8:07 pm

@Dar- I'll check the code again, but I could swear there's a check to make sure a malloc isn't going to allocate more than 64k bytes.
The other stuff - point taken. I'm still feeling jetlagged and caffeine-deprived, and you're right about the duplication.

LCMark
Livecode Staff Member
Livecode Staff Member
Posts: 1208
Joined: Thu Apr 11, 2013 11:27 am

Re: Random Sort

Post by LCMark » Wed May 29, 2013 10:11 am

@mwieder:
Why? I've never thought of sorting cards. What does this give me? And would they be sorted by name, id, number?
The ability to sort cards is there as part of the functionality for using stacks as (card-based) databases. You can 'sort cards of <stack> by <sortKey>'.
Adding 'sort cards of <stack> randomly' would be just ensuring consistency and completeness of syntax (in this case it would just randomise the order of the cards in the stack - i.e. sort randomly by 'the number').
Since fields are limited to 64k characters, that seems like a large random number. I do see where in the code this limit is checked, but I'm not sure why it's in place. Is this an OS limitation? An artifact of legacy code? Is there something that would break if we increased it?
Paragraphs are limited to 64k but you can have as many paragraphs as you like in a field. This is an engine limitation which could be increased but requires both going through all the field code checking the index computations and a change in file-format which is why it's still there. It was always intended that we'd remove this limitation when we redid the paragraph object part of the field to better support complex text rendering (in particular, right-to-left language support).

In regards to the size of the random number, then computationally it doesn't matter whether its random(10) or random(10000000) - each random is generated in a fixed number of steps regardless of the magnitude of the upper limit. Indeed, the larger the random number the better for sorting randomly since it ensures it minimises the number of collisions.

mwieder
VIP Livecode Opensource Backer
VIP Livecode Opensource Backer
Posts: 3581
Joined: Mon Jan 22, 2007 7:36 am
Location: Berkeley, CA, US
Contact:

Re: Random Sort

Post by mwieder » Wed May 29, 2013 5:57 pm

Ah - card-based databases. It's been a while since I've thought of that paradigm.
I can see how "sort randomly" would be useful for a flash-card application, like "go to any card".

Locked

Return to “Engine Contributors”