Ah - this issue
A lot of people seem to think of the current behavior as a 'quirk' of the language, and indeed I did too up until a few months ago, when I (finally?) had some time to review it (as a result of a bug report, although I think there have been others before it: http://quality.runrev.com/show_bug.cgi?id=10727).
First of all, it is important to bound what we are talking about. The 'ignoring the last empty element' issue only comes up when dealing with string lists that could contain an empty element; it has no effect (for obvious reasons) if the lists you are manipulating never contain empty elements. So, in what follows, when I say 'nullable string list' I mean 'string list which could contain empty elements'.
It is my present opinion that the current behavior of the engine when manipulating (nullable) string lists is neither a bug nor a quirk; it is simply how things have to be if one wants a logical and consistent semantic for working with them. Indeed, I really do think the HyperCard engineers thought very carefully about this over 26 years ago, and it is why they chose the semantic they did.
The issue comes, very simply, down to this. If you want to be able to represent a nullable string list of any number of empty items from 0 upwards then the trailing delimiter has to be ignored. i.e.
Code:
put the number of items of "" -- is 0
put the number of items of "," -- is 1
put the number of items of ",," -- is 2
If you do not have the 'ignore the trailing delimiter' semantic then you have two choices: either you can represent lists of empty items of length 1 and upwards (option A):
Code:
put the number of items of "" -- is 1
put the number of items of "," -- is 2
put the number of items of ",," -- is 3
Or you can't represent lists of empty items of length 1 (option B):
Code:
put the number of items of "" -- is 0 (as empty should be empty and mean the empty list which means no elements)
put the number of items of "," -- is 2
put the number of items of ",," -- is 3
So, what's the problem with these two options...
Well, option (A) breaks the universal idea that 'empty' means nothing and thus auto-converts to the appropriate 'nothing' for the given data type that is requested (i.e. as a number it becomes 0, as a string it becomes "", and in the refactored engine - in a 'strictMode' - as an array it will become the array with no elements). Having this universal idea of empty (rather than a typed one) is, to my mind at least, an exceptionally useful part of the language and works very well with the typeless/contextually-typed nature of LiveCode and is something that I personally feel is well worth preserving.
Option (B) is, as far as I can see, unworkable - not being able to represent the list of one empty element creates a huge discontinuity that would require explicit script to work-around at every point you might want to deal with such a thing.
So, whilst it might be all very well to propose ditching (or making optional) the 'ignore the trailing delimiter' semantic, doing so has significant side effects which, based on my analysis, result in a worse situation code-wise than the one we have at the moment.
When it comes down to it, with the current semantics (trailing empty elements being ignored) it essentially means that everyone and everything must observe two very simple rules:
- If you are processing a string list that could contain empty elements then you must preserve the trailing delimiter.
- If you are producing a string list which may contain empty elements then you must make sure you have a trailing delimiter.
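As a rough sketch of what the second rule means in practice (the handler name and the sample values are purely illustrative, not taken from any real stack), compare the same three elements written with and without the trailing delimiter:

```livecode
-- Hypothetical demo handler; the name and sample data are illustrative only.
on demoTrailingDelimiter
   local tWithDelimiter, tWithoutDelimiter, tCountWith, tCountWithout

   -- Three elements ("a", "b", empty), each one followed by a delimiter
   put "a,b,," into tWithDelimiter

   -- The same three elements joined with delimiters only between them
   put "a,b," into tWithoutDelimiter

   put the number of items of tWithDelimiter into tCountWith
   put the number of items of tWithoutDelimiter into tCountWithout

   -- Under the current rule this reports "3,2": the final empty element
   -- only survives when the producer adds the trailing delimiter.
   put tCountWith & comma & tCountWithout
end demoTrailingDelimiter
```

The second string is where code that does not observe the rules goes wrong: its one trailing comma is read as the ignored trailing delimiter rather than as an empty third element.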
Therefore I think it is far more likely that the issue / quirkiness people perceive with the current rule is entirely down to either: these two rules not being observed correctly in some places in the engine and externals; or it not being articulated in the documentation etc. that these two rules are important to observe (if you are dealing with string lists that could contain empty elements) and thus meaning there is code that people have written that does not observe them.
Whilst I'm generally always for 'giving people choice', in this scenario I think it would be a very unwise move - it really is something that has to be a global semantic. Making the rule optional will create a significant interoperability problem - libraries and code that are written to accept (nullable) string lists with the current rule will require explicit checks and extra code to make them work with libraries and code that are written without the current rule. To give an example...
Let's say you have the following function in a library stack written with the current rule:
Code:
function mapList pList, pFunc
   local tMappedList
   repeat for each item tItem in pList
      dispatch function pFunc to me with tItem
      put the result & comma after tMappedList
   end repeat
   -- Mirror the input: keep the trailing delimiter only if pList had one
   if char -1 of pList is not comma then
      delete the last char of tMappedList
   end if
   return tMappedList
end mapList
This function is simply a 'map' primitive: it applies pFunc to each item in the list and returns the list of results. It doesn't know whether pList could contain empty elements, so by rule (1) above it ensures it preserves the trailing delimiter.
Now, let's say there is a function in an object script elsewhere which is written without the current rule (here I'm assuming it's a context-local property):
Code:
function typeItem pValue
   if pValue is empty then
      return "empty"
   else if pValue is an integer then
      return "integer"
   else if pValue is a number then
      return "real"
   end if
   return "string"
end typeItem

on testMapList
   set ignoreTrailingDelimiter to false
   put "1,2.5,foo," into tList
   answer mapList(tList, "typeItem")
end testMapList
So, if I were writing 'testMapList', I'd expect an answer dialog to pop up and give me "integer,real,string,empty". However, what I will get is "integer,real,string,". Why? Well, with the 'ignoreTrailingDelimiter' set to false, I'm giving it a list of 4 items the last of which is empty; however, because I'm passing the list to a function that ignores trailing delimiters then it only sees three items.
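For comparison, here is a minimal sketch of how the same four-element list is written under the current rule, where the final empty element needs its own trailing delimiter. The function name countEmptyItems is hypothetical and exists only for this illustration:

```livecode
-- Hypothetical helper: counts how many items of pList are empty.
function countEmptyItems pList
   local tCount, tItem
   put 0 into tCount
   repeat for each item tItem in pList
      if tItem is empty then add 1 to tCount
   end repeat
   return tCount
end countEmptyItems

on demoCountEmptyItems
   -- Three items; the single trailing comma is ignored, so this gives 0
   put countEmptyItems("1,2.5,foo,") into tThreeItems
   -- Four items, the last one empty, followed by its delimiter: gives 1
   put countEmptyItems("1,2.5,foo,,") into tFourItems
   put tThreeItems & comma & tFourItems
end demoCountEmptyItems
```

Code written to the current rule sees the fourth element only in the second form, which is exactly the form a caller working without the rule would never think to produce.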
Of course it would be possible to argue that the above example suggests that 'ignoreTrailingDelimiter' should be passed down the callee chain, and in this simple case that would make things work. However, if 'mapList' were a function which were doing something a great deal more complicated such as making and manipulating its own (nullable) string lists or calling other functions that do then it (or the functions it calls) would have to explicitly check for and conform to the current setting of ignoreTrailingDelimiter (or turn it off and on as appropriate). Therefore, ultimately, allowing a choice means that a heavy burden is put on anyone wanting to share their code (whether in libraries, snippets, custom controls etc.) - they'd have to take into account the context in which it is being called.
[ You could also argue that the engine 'should know' that tList is a list built under ignoreTrailingDelimiter == false and so should be able to adjust as appropriate, but it can't know because lists are just strings. ]
Given that not having the ignoreTrailingDelimiter rule means either one of option (A) or option (B) above (which have significant flaws of their own), I have to ask - is it really worth it? If everyone and everything follows the rules required by ignoring trailing delimiters then (as far as I can see) everything works well, you don't have to juggle things when passing lists between code that is written to two different standards, and you don't have to deal with the fact that either 'the number of items of empty is 1' (A), or that you can't represent a list of 1 empty item (B).
Ideally of course, one would not have to worry about the two rules above at all - nor the flaws (A) or (B) of choosing a different way to handle trailing delimiters - however, I do believe it is unavoidable if one's lists are always strings. Thus, the real solution (beyond keeping the status quo but ensuring complete consistency of application of the two rules above) is having real lists - i.e. ones that are a structured data type. These will be added to the refactored engine (although perhaps not in the initial release) and will slide naturally into the current semantics we have. When we have these, I don't think anyone will ever really need to consider any delimiter issues - as it won't be a relevant concern. [ Note that I'm not proposing changing existing functionality by introducing this new idea of 'real lists', you will be able to use them if you wish, or just carry on how you do things now. ]
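To give a flavour of why a structured type sidesteps the problem, here is a rough analogy using today's numerically keyed arrays. This is not the proposed 'real list' type, just an existing structure in which the element count does not depend on any delimiter; the handler name is hypothetical:

```livecode
-- Hypothetical demo: a numerically keyed array standing in for a list.
on demoArrayAsList
   local tList, tCountOne, tCountTwo

   -- A 'list' holding one empty element
   put empty into tList[1]
   put the number of elements of tList into tCountOne

   -- Add a second empty element
   put empty into tList[2]
   put the number of elements of tList into tCountTwo

   -- Reports "1,2": the count comes from the structure itself,
   -- so there is no trailing delimiter to preserve or forget.
   put tCountOne & comma & tCountTwo
end demoArrayAsList
```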
Now, of course, all that being said, the above is entirely my analysis of the situation and I'm more than happy for someone to prove me wrong in any part of it. More specifically, as much as someone above referred to defaulting to 'superstition' and such to defend the status quo, it seems to me that the suggestion of changing the existing behaviour hasn't really been backed up by any demonstrations of how it would improve the ease of coding beyond 'that rule doesn't make sense, it should change'. So, what we really need is real-world code examples which would support the proposed introduction of an option to switch the current semantic on and off, which we can then analyse - evaluate them in the two contexts and try to determine absolutely whether there is an advantage to implementing the proposed addition. However, I must say that it is my gut feeling that any such example will either demonstrate an area where one of the two rules (required by the current 'ignore trailing delimiter' rule) is not being observed (whether that be in the engine/externals, or in the code itself), or a situation in which string lists will never be sufficient to perform the intended purpose of the code.