The Dao Programming Language
for Scripting and Computing

Label: ♦english ♦feature planning

[408] Ideas for a new type: enum

The basic idea is to extend the original enum syntax and allow enum to be a type with a set of field names and values:
  1. the type name will be in form:
    enum<both=0,file=1,dir=2>
    # or simply
    enum<both,file,dir>;
  2. the original enum syntax can stay the same or be extended:
    enum { AA, BB } # will be equivalent to enum<AA,BB>;
    enum EE{ AA, BB } # will be equivalent to enum<AA,BB> with alias EE;
  3. usage can resemble Ruby symbols (but with a different prefix): $name, which can be considered a free enum; that is to say, it can be passed to any enum type which has a field named name:
    enum E1{ AA = 1, BB = 2 } # enum<AA=1,BB=2>
    enum E2{ AA = 10, BB = 20 } # enum<AA=10,BB=20>
    
    e1 : E1 = $AA;
    e2 : E2 = $AA;
    Here e1 will have value 1, and e2 will have value 10, so one can just use $AA for convenience.
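A rough way to picture the free-symbol rule, sketched in Python rather than Dao: each enum type is modelled as a plain mapping from field names to integers, and a symbol such as $AA is resolved against whichever type it is assigned to. The `resolve` helper and the dict representation are assumptions made for illustration only, not how the Dao VM stores enums.

```python
# Hypothetical model: an enum type is a mapping of field names to integers.
E1 = {"AA": 1, "BB": 2}     # enum<AA=1,BB=2>
E2 = {"AA": 10, "BB": 20}   # enum<AA=10,BB=20>


def resolve(symbol, enum_type):
    """Return the integer bound to a free symbol such as "$AA" in enum_type."""
    name = symbol.lstrip("$")
    if name not in enum_type:
        fields = ",".join(enum_type)
        raise TypeError(f"{symbol} is not a field of enum<{fields}>")
    return enum_type[name]


e1 = resolve("$AA", E1)
e2 = resolve("$AA", E2)
print(e1, e2)  # 1 10
```

A symbol that names no field of the target type (for example $CC against E1) is rejected, which is the check a type system would be expected to make at compile time.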

Such an enum type eliminates the need to define constants just for certain function parameters, as with stream.seek(), which can be redefined as:
seek( self :stream, pos :int, from : enum<begin,current,end> )=>int
Then only $begin, $current or $end can be passed as the from parameter to seek(). This can be guaranteed by the Dao typing system!
Comments
[409408] Done
(fu, 1, 2010-10-24, 08:35:46) Comment
And updated to the repository on dao.googlecode.com .
That's an interesting improvement! But I also propose to allow the following syntax:
enum E {a, b, c}
x = E.a;
That seems very useful to me, as it could make code clearer in certain situations, particularly when there are a lot of enums with simple symbol names. This would be an "enum class" which (just by its name) could provide information about its metatype (field of usage, to put it simply). "Font.courier" is somewhat better than "$font_courier", while just "$courier" may have quite an ambiguous meaning (depending on context).
    ....................................
Hmm, as the old-style enums seem to be deprecated, I suppose this feature would eventually be inconsistent with the new syntax, or even contradict it. It is then probably better to keep things simple. Besides, I doubt readability is enough of a problem in this case to justify a new syntax.
A new crash-bug related to enums:
i: int = $a; #a grave mistake...

Found a serious bug: a variable of enum type cannot be properly compared with an enum symbol at all:
x = $a;
if (x == $a) #false

   ...

switch (x)
{
   case $NOT_a: #true?!

      ...
}

if (x == x) #false again!

   ...
if ($a == $a) #the same as above

   ...
    .....................
By trial and error I've at last found a way to handle enums: converting them to "int". But still, the code cited above should work, I suppose.
And this seems like another defect:
routine p(x: enum<a, b>){}
p($c);  #works fine though the argument should be incompatible

Finally, what is the purpose of old-style enums now?
enum {a, b, c}
x = a; #error 'Symbol not defined'
    ...........................
It seems like a leftover from the old syntax (without an enum name), so such a definition should just be banned. By the way, is the named enum syntax really needed as well? One can just define a type alias for "enum<x, y, z>"...
[415410] ...
(fu, 1, 2010-10-24, 22:49:46) Comment
I have thought about this too. As for the problems you mentioned in your other comments, I realized just after committing that they hadn't been properly handled, but I was too tired to fix them yesterday :( I will fix them as soon as possible.
    [I've just modified some of my old comments]
And another thought: Why do we need to bind integers to enum symbols? If they may be treated as a different data type, I doubt it is needed. They may be just specially handled implicit strings. At least, I see no reason to type "if ((int)x == 1)" when I can type "if (x == $value)". But maybe I missed something important?
Now fixed.
There are two reasons to bind integers to enum symbols:
  1. Convenience. Consider this:
    stream.seek( self :stream, pos :int, from :enum<begin,current,end> )=>int
    If there is an integer bound to the enum, then instead of checking the string value, we can do:
    int where = SEEK_CUR;
    switch( p[2]->v.e->id ){
    case 0 : where = SEEK_SET; break;
    case 1 : where = SEEK_CUR; break;
    case 2 : where = SEEK_END; break;
    }
    or we can even simply use an array:
    int options[] = { SEEK_SET, SEEK_CUR, SEEK_END };
    int where = options[ p[2]->v.e->id ];
    This is a lot easier than just using a string :)
  2. Interfacing with C/C++ libraries. Enums in C/C++ libraries will eventually be wrapped as Dao enums, and binding an integer to each enum symbol will make things a lot easier.
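The array lookup in point 1 can be mirrored in Python as a minimal sketch. The field order of enum<begin,current,end> gives each symbol its implicit integer, and that integer indexes a table of the platform's seek constants. The `seek` wrapper and the tuple names are hypothetical; only `os.SEEK_SET`, `os.SEEK_CUR`, `os.SEEK_END` and `io.BytesIO` are real library names.

```python
import io
import os

# Field order of the hypothetical enum<begin,current,end>; the position is the bound integer.
SEEK_FIELDS = ("begin", "current", "end")
WHENCE = (os.SEEK_SET, os.SEEK_CUR, os.SEEK_END)


def seek(stream, pos, origin):
    """Seek using a symbolic origin, looking up the low-level constant by enum id."""
    enum_id = SEEK_FIELDS.index(origin)   # raises ValueError for an unknown symbol
    return stream.seek(pos, WHENCE[enum_id])


buf = io.BytesIO(b"0123456789")
print(seek(buf, -3, "end"))      # 7
print(seek(buf, 2, "current"))   # 9
```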

Unnamed enums defined by enum {a, b, c} will no longer be supported; they have become useless.

Named enum syntax is needed, because there is a minor difference from defining an alias for enum<x,y,z>: the enum syntax allows constant folding:
const C = 4
enum EE
{
    AA = C<<1,
    BB = C<<2
}
Constant folding is not possible in type names.
Alright, but mostly I mean this: why should enum symbols be explicitly assigned to integers in Dao code? Implicit numbering is enough to handle them easily on the API level, while I don't see a necessity to handle enums as numbers on the Dao script level. On this higher level, they could be just symbolic literals; it's simpler and nicer, I suppose. As all the examples you cited above matter for the internal level only, why again should this be explicit on the language level?
Maybe sometimes one needs to use enum to define bit flags or flags with specific values. And sometimes it is also good to have a direct mapping from the low level to the high level, if we want to interface well with C/C++.
I think this is a really, really good feature; it allows us to make the API clear :) I also like it that we can just re-use the same name with no problem!
And that's why you really don't need "Font.courier" syntax: in loadFont($courier) and me.follow($courier) $courier can be two totally different things, but both make sense.

As for not hiding the underlying integer, I am undecided. The only reason I see for showing it to the programmer is to make bitfields. And if one really wants to hide the integer, one could even add an "enum bit" or "bitfield" or whatever. But I have no opinion on this one.
What is this "operator to get field of enum type" mentioned in the recent enum-fixing revision? Is this a new syntax for something? I don't have even a vague guess of what such a thing might look like... :)
Apropos of bit flags: how can new enums represent compound values? One just cannot type "$a | $b" or "$a + $b" (while converting them to "int" is meaningless in this context); however, a list of enums may be used like a set of flags: "{$a, $b}".
Yeah, we can even redesign some of the existing API to take advantage of this new feature.
It's just what you mentioned in comment 410:
enum E{A,B}
x = E.A 
# or:
x = E::A

Now these operations are not yet supported, but they can be supported easily. But currently I am trying to figure out if there is a convenient way to support flag set as enum, or simply add another type for this.
Another reason is that it is necessary to allow the bound integer to appear in type names. Suppose we have two enum types where the same names are bound with different integers, without the integers in their type names, it will be quite ambiguous if they are the same or actually different. Then if integers are allowed in the type names, it is natural to allow them in the enum syntax.
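The ambiguity described here can be made concrete with a small Python sketch. Two enum types with the same field names but different integers render to the same type name when only names are shown, and to distinct names once the integers are included. The `type_name` helper and the tuple-of-pairs representation are assumptions for illustration.

```python
def type_name(fields, with_values):
    """Render a hypothetical enum type name from (name, value) pairs."""
    if with_values:
        body = ",".join(f"{name}={value}" for name, value in fields)
    else:
        body = ",".join(name for name, _ in fields)
    return f"enum<{body}>"


E1 = (("AA", 1), ("BB", 2))
E2 = (("AA", 10), ("BB", 20))

# Names only: two different types collapse into one name.
print(type_name(E1, False) == type_name(E2, False))  # True
# Names plus integers: the two types stay distinct.
print(type_name(E1, True), type_name(E2, True))      # enum<AA=1,BB=2> enum<AA=10,BB=20>
```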
Hmm, I'm not sure about that syntax's consistency... Maybe something like "$E.A"? Then it would be obvious in any context that this refers to an enum.
Well, $X should refer to a symbol only in standard context (in macros, it's for macro variables). E.A is just like accessing a class member and is less confusing than $E.A, where one may wonder what exactly $E is.
I think TheTrueNightWalker is right, instead of bitfields, one might just use a list of enums.
Also please keep in mind that one will certainly want to store an enum and use it later, something along the lines of this (untested):
class Display {
    ratio : enum<standard, wide>
    # ...
    routine update() {
        # someRoutine signature: routine<ratio : enum<standard,wide> => any>
        someRoutine(ratio);
    }
}

d = Display{$wide};
d.update();
I did not try this, just want to draw your attention to this use case.
I am thinking of supporting another type for flags or bitfields; what I am not sure about is whether to add a new keyword flag or to twist the use of enum:
a : flag<AA,BB> = $AA
# or
a : enum<AA|BB> = $AA

a += $BB # add flag BB
a -= $AA # remove flag AA
a = $AA + $BB # both flags, like AA|BB in C
if( $AA in a ) ...
The problem with this way of using enum is that it doesn't look consistent with anything else in the language. But supporting this type is also quite beneficial (e.g. it can make some APIs simpler, just as enum does). And it should be simpler and more efficient than using a list of enums to do the same thing. I prefer to add the flag keyword; maybe I will do so.
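For comparison, the flag operations proposed above have a close analogue in Python's standard `enum.Flag`, shown here as a sketch of the intended semantics rather than of Dao's implementation. The mapping of `+=` to `|=` and of `-=` to `&= ~` is an assumption about what the proposed operators would mean.

```python
from enum import Flag, auto


class Flags(Flag):      # stands in for the proposed flag<AA,BB>
    AA = auto()         # bit value 1
    BB = auto()         # bit value 2


a = Flags.AA            # a : flag<AA,BB> = $AA
a |= Flags.BB           # a += $BB   (add flag BB)
a &= ~Flags.AA          # a -= $AA   (remove flag AA)
only_bb = a

both = Flags.AA | Flags.BB    # a = $AA + $BB
print(Flags.AA in only_bb)    # False
print(Flags.AA in both)       # True
print(both.value)             # 3
```

Because the whole set lives in one integer, membership tests and updates are single bit operations, which is the efficiency argument made against a list of enums.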
There may be other alternatives, so it should be considered well before commit. And I do have a certain idea: let's provide a set type! Perhaps, similar to this:
s: set<$a, $b, $c>; #it can hold only these enums as elements

s = {$a, $b};     #{actually, here would be something not ambiguous with other types; I just don't know what yet #}
if ($a in s) ...
s += $a;         #$a wouldn't be duplicated, it's a set after all

s -= $a;
for (element in s) ...

#What may seem even more interesting: set could support non-enums as well

#And perhaps even filtering expressions!

s3: enum<1 ... 10>;  #a restricted range?

s4: enum<"%a">;    #a set of regex-filtered strings?
It's not a concrete proposition, though, -- just a fantasy of mine :)
I think a set type should be for real sets. Here enum and flag are meant to be simple tiny types :), which don't do much on their own, but are provided to make other things more convenient.
I will go with type form and syntax such as enum<AA;BB> and enum Flags{ AA; BB } for flags.
I do prefer the "flag" keyword, as it really makes it obvious what is being done, and what to use it for. I am already confused by all the different symbol syntaxes for map, hashmap, tuple, ... so that would confuse me even more :D
As for me, supporting "flag" seems somewhat redundant. Besides, its name says nothing concrete about its usage and may be even more confusing. I suppose if a new (formal) type is to be added, it should be used for something more than just multi-enums.
I've come up with another idea. We could support bit flags on top of existing bitarray type, as they are related by meaning:
bits: bitarray;            #a usual bitarray

enums: bitarray<a, b, c>;  #a fixed-size bitarray with 3 named bits

enums = $a + $b;
enums[$c] = 1;             #setting a bit

enums += $c;               #the same as above

if (enums[$c]) ...         #checking a bit
The advantages:
  • new keyword is not needed;
  • meaningful declaration;
  • the interface is essentially the same as of bitarray (while "enum<a, b>" and "enum<a; b>" would have different interfaces causing inconsistency);
  • the usage is more or less intuitive (similar to arrays);
  • easily distinguishable from a usual bitarray ("enum<a, b>" and "enum<a; b>" look almost the same);
  • would increase the overall significance of bitarray type.

Actually, bitarray is not a keyword, it's just an internally defined type name (like thread, mutex etc.). To use bitarray in this way, it would need to be treated as a new keyword. Also, it cannot be used directly as flags at the C level. For now I will implement the flag type first, and then decide what type name to use later :)
Yeah, though I didn't welcome a new formal type before, I implicitly proved its necessity in my previous post :) Regarding bitarray: I investigated it a little, and it appeared to be just an alias for the "long" type, nothing more. Why then is bitarray supported as a type at all?
The type object dao_array_bit is used to inform the VM to mark a long integer object as a bit array, and handle it properly. This solution is not elegant, but is very convenient. It was created quickly when I was preparing a demo a long time ago.

Maybe we can simply support bit arrays as base-2 long integers, and allow them to be created by:
bits = 101001110L2 # L2 for base 2
# or:
bits = 101001110B
Similar to the use of L2 , one may also use:
b3 = 102001210L3 # L3 for base 3
b8 = 701005110L8 # L8 for base 8
b16 = 109001A10L16 # L16 for base 16
The only problem is that these suffixes do not seem to be very visible. Maybe 101001110LB2 or 101001110LX2 is better?
Or just 101001110X2, so that there will be no problem supporting bases up to 32.
Hmm, I'm not sure whether arbitrary bases should be supported, but adding binary literals (and "long" hexadecimal ones, by the way) sounds good indeed. And it seems better to me to use the "Ln" postfix for such things ("L2", "L16"), which would indicate that it's "long", not "int".
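To see what the proposed "Ln" literals would denote, here is a small Python sketch that splits a literal at its last L and hands the two parts to the built-in `int(text, base)`. The `parse_based` helper is hypothetical, and Python integers stand in for Dao's long type.

```python
def parse_based(literal):
    """Parse '<digits>L<base>' (for example '101001110L2') into an int."""
    digits, sep, base = literal.rpartition("L")
    if not sep or not digits or not base.isdigit():
        raise ValueError(f"not a based literal: {literal!r}")
    # int() raises ValueError for digits that do not fit the base.
    return int(digits, int(base))


print(parse_based("101001110L2"))                    # 334
print(parse_based("109001A10L16") == 0x109001A10)    # True
```

One consequence worth noting: with the usual 0-9, A-Z digit alphabet, L is itself a digit in bases above 21, so the L suffix is only unambiguous up to base 21, while X does not become a digit until base 34.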
So, as we now have enum with combinable flags, do we really need explicit number assignment to enum symbols? I think it may already be excluded from the syntax without any loss of functionality.
And I have a new idea concerning enum declaration syntax and functionality:
flags: enum< {a, b} >; #or 'enum< (a, b) >', a usual combinable enum

flags2: enum < {a, b}, c >; #this means that the enum may contain a combination of 'a' and 'b', or only 'c'

flags2 = $c;
flags2 += $a; #error: incompatible symbols!

flags2 = $a;  #OK

flags2 += $b; #OK

flags3: enum< {a, b}, {c, d} >; #for incompatible sets of flags...
Why do we need this? There are many cases where certain flags may be incompatible or just meaningless if used together. Besides, it seems like a good solution to me for making the declaration of a combinable enum clearer.
I have thought about something similar to what you suggest here, but in a slightly different way: my idea was to allow enums from different groups to be combinable, instead of those from the same group as in your case. I came up with that idea when I was thinking about flags for text alignment, where the form of the enum could be enum<top,center,bottom;left,center,right>, so that top and left are combinable, but top and bottom are not. It seems difficult to do this according to your proposal. Anyway, in the end I gave up this idea, because it would be complicated to implement, and it wouldn't be as useful as we think.
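One reading of the alignment idea is "at most one symbol per group". The Python sketch below checks a combination by brute force: it is accepted if its symbols can be assigned to distinct groups that each contain the corresponding symbol. That reading, the `is_valid` helper, and the treatment of center (it may fill either group) are all assumptions, since the idea was never implemented.

```python
from itertools import permutations

# Hypothetical model of enum<top,center,bottom;left,center,right>
GROUPS = (("top", "center", "bottom"), ("left", "center", "right"))


def is_valid(symbols):
    """True if each symbol can be placed in its own distinct group."""
    if len(symbols) > len(GROUPS):
        return False
    return any(
        all(sym in group for sym, group in zip(symbols, chosen))
        for chosen in permutations(GROUPS, len(symbols))
    )


print(is_valid(("top", "left")))            # True
print(is_valid(("top", "bottom")))          # False
print(is_valid(("center", "top")))          # True
print(is_valid(("left", "top", "center")))  # False
```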

Regarding explicit number assignment, again it is required mainly for interfacing with C/C++ libraries (in particular, Qt).
I may be totally wrong, but I see a very simple way of implementing such 'smart' combinable enums.
Having had a brief look at the current implementation, I assume that multiple enum symbols are stored as different bits in the value field of DEnum. So, it is possible to just reserve several bits for groups of symbols.
Suppose we want to support up to 3 groups, whose current number is stored in the higher bits of a 64-bit value, and up to 48/group_count symbols in each group (the numbers are chosen for simplicity only). Suppose we have enum<a,b; c,d; e> (here and further I actually mean my view of such enums -- again just for simplicity). Then its value may look like:
  • 0x 2000 0000 0000 0000 - for an empty enum ( 2 means three groups are used);
  • 0x 2000 0000 0000 0001 - for $a ;
  • 0x 2000 0000 0000 0003 - for $a + $b ;
  • 0x 2000 0000 0001 0000 - for $c ;
  • 0x 2000 0000 0003 0000 - for $c + $d ;
  • 0x 2000 0001 0000 0000 - for $e ;
  • ...
Explicitly it would mean enum<a = 1, b = 2; c = 0x10000, d = 0x20000; e = 0x100000000> (assume that these IDs are provided automatically). Then checking whether a new (generally compatible) symbol is allowed to be included becomes simple: its bit just needs to be in the same 16-bit fragment as all the bits of the other symbols already included.
I presume that during compilation enum symbol is replaced with its ID according to the enum it is being used with -- then the above seems like an acceptable approach to me. However, I may just have false notion about DEnum handling.
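The inclusion check described above can be sketched in Python as follows. Each group occupies one 16-bit fragment of the value, as the example IDs suggest, and a symbol may be added only if its bit lies in the fragment already in use. The names `FRAGMENT_BITS`, `fragment` and `include` are hypothetical, and the group-count header in the highest bits is left out to keep the sketch short.

```python
FRAGMENT_BITS = 16   # assumption: group width implied by the example IDs

# enum<a,b; c,d; e> with the automatically assigned IDs from the text
IDS = {"a": 0x1, "b": 0x2, "c": 0x10000, "d": 0x20000, "e": 0x100000000}


def fragment(bits):
    """Index of the fragment holding the lowest set bit of a non-zero value."""
    lowest = bits & -bits
    return (lowest.bit_length() - 1) // FRAGMENT_BITS


def include(value, name):
    """Add a symbol to value, rejecting symbols that belong to another group."""
    bit = IDS[name]
    if value and fragment(value) != fragment(bit):
        raise ValueError(f"${name} is incompatible with the symbols already set")
    return value | bit


v = include(include(0, "a"), "b")
print(hex(v))                  # 0x3
print(hex(include(0, "e")))    # 0x100000000
```

Trying `include(v, "c")` raises ValueError, because $c lives in the second fragment while $a and $b occupy the first.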
You are not wrong. Your type of combinable enum is much easier to implement than mine: it is easier to allow symbols from the same group to be combinable than to allow symbols from different groups to be combinable. But I am not yet convinced of its usefulness.
Hmm, I don't understand why your combinable enums are more difficult to implement... You just would need to check that all the bits are stored in the different fragments of value instead. Or to assign the numbers as enum<a = 1, b = 0x10000, c = 2, d = 0x20000, e = 4> -- then the checking would be the same as with my enum variant. Your example of alignment enum would just need to look slightly different: enum<left, right; top, bottom; center> .
I just hope your numbers (3 groups) are just for illustration. If such a feature would be implemented, it should be without any limit.
Yeah, 3 groups were just for the simplicity of illustration; however, if I understand the current implementation of enums right, combinable enums rely on a single dint value field to store all the symbols they hold. Therefore, in order to support an unlimited number of symbols, the very basis of the enum type would need to be reimplemented in a more complicated way.
I agree that such changes may be justified in favor of the completeness of the enum type... But assuming that one doesn't usually need dozens of flags for a single parameter, such thing could be just an overkill.
P.S. With my x64 version of DaoVM, I doubt I will ever face any problems regarding this limitation :)
There is no simple equivalence between your idea and mine. And enum<left, right; top, bottom; center> isn't exactly equivalent to what I was talking about. For example, $left+$top+$center is a valid combination for this form of enum type, but it makes no sense for text alignment, which requires only a combination of two symbols for horizontal and vertical alignment.

This kind of grouped symbols in enum type doesn't seem to be very useful.
Alright, I have a possible solution to this issue as well. To support enum<left, right, center; top, bottom, center>, the $center flag could be made a double-bit value. More precisely, an enum symbol's value could contain a separate bit for each group that includes this symbol. Thus we would have something like enum<left = 1, right = 0x100, top = 2, bottom = 0x200, center = 0x404>.
Then verifying symbol inclusion becomes the following: if the symbol has the bit corresponding to the active value fragment, this symbol is allowed to be included. Besides, when determining the active fragment, 'symmetric' bits should be excluded from the check (but not if they are the only ones present!). An example to make things clear:
  1. ...0000 00000001 L2 - $left
  2. ...0100 00000101 L2 - $left + $center
  3. ...0_0? 00000_01 L2 - checking whether $right can be included ('_' bits are not counted, so '?' bit is not within the current scope)
However, I'm afraid my approach has already become too sophisticated... :)

Copyright (C) 2009-2013, daovm.net.
Webmaster: admin at daovm dot net