Discussion:
VB6 use of For Each with Collections
s***@yahoo.com
2013-05-07 21:22:36 UTC
Hi folks,

I am just trying to get my head around using the For Each ... to recurse through a custom Collection object. This is my first use of these tools.

I can't seem to see a way to derive the "KEY" value assigned to a member of the collection while using the For Each approach. The Default property of the collection object is the "ITEM" value.

Declared in a module:
Public colObject As New Collection

Form code:
Dim varValue As Variant
Dim strValue As String

'Build simple collection
colObject.Add "myItem01", "myKey01"
colObject.Add "myItem02", "myKey02"
colObject.Add "myItem03", "myKey03"
colObject.Add "myItem04", "myKey04"

For Each varValue In colObject
'This grabs the [Item] value because it is the default property.
strValue = CStr(varValue)
Next

I don't see a way to determine the "KEY" value. Use of the KEY value would help me compare one collection to another collection. My ultimate goal is to use the speed of "For Each" with collections to quickly compare two large textual datasets for discrepancies. I want to stay away from connections and dependencies to SQL databases/services, and very much prefer to have a simple 100 kb application perform such a task.

I realize I can enter the KEY value in BOTH the KEY & ITEM entries to get my end result, but it doubles the data size of the collection.

Any insight is appreciated.
Thanks.


~Steve
Michael Cole
2013-05-08 00:21:38 UTC
Post by s***@yahoo.com
Hi folks,
I am just trying to get my head around using the For Each ... to recurse
through a custom Collection object. This is my first use of these tools.
You are not using a custom Collection object. You are using a
stock-standard out-of-the-box collection, which will contain variants.
Post by s***@yahoo.com
I can't seem to see a way to derive the "KEY" value assigned to a member of
the collection while using the For Each approach. The Default property of
the collection object is the "ITEM" value.
It's not the default value, it's the only value. Key is not a property,
it's a way of retrieving the object. You cannot access it as a
property.
Post by s***@yahoo.com
Public colObject as New Collection
Dim varValue As Variant
Dim strValue As String
'Build simple collection
colObject.Add "myItem01", "myKey01"
colObject.Add "myItem02", "myKey02"
colObject.Add "myItem03", "myKey03"
colObject.Add "myItem04", "myKey04"
For Each varValue In colObject
'This grabs the [Item] value because it is the default property.
strValue=CStr(varValue)
No. It grabs the collection item, which happens to be a variant
containing a string. 'Cos a string is what you added to it.
Post by s***@yahoo.com
Next
--
Michael Cole
ralph
2013-05-08 00:39:06 UTC
Post by s***@yahoo.com
I can't seem to see a way to derive the "KEY" value assigned to a member
of the collection while using the For Each approach. The Default property
of the collection object is the "ITEM" value.
/snipped
I don't see a way to determine the "KEY" value. Use of the KEY value would
help me compare one collection to another collection.
/snipped
I realize I can enter the KEY value in BOTH the KEY & ITEM entries to get my end result, but it doubles the data size of the collection.
As you discovered, you can't, as there is no "Key" property.

There are various work-arounds using additional arrays or collections,
and mechanisms to combine multiple properties within a single value.

It sounds like you have explored them.

[The Scripting Dictionary is another solution, but would require an
outside reference. Dictionaries do seem a tad faster than VB
Collections.]

The easiest way, and my personal preference, is to create your own
Collection and add a Key property, as per CVMichael's suggestion ...
http://www.vbforums.com/showthread.php?423549-RESOLVED-Displaying-the-quot-Key-quot-in-a-Collection

This has the additional advantage of allowing you to also provide
other useful "Properties" when designing specialized Collections.

However, as noted this will also store a "Key" twice.

If storage is such a problem, perhaps you might explore creating a
Class that manages Arrays and key mappings to mimic a VB Collection.

-ralph
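
A minimal sketch of the Scripting Dictionary route mentioned above, assuming the Microsoft Scripting Runtime is installed on the machine. It is created late-bound through CreateObject so no project reference is required; the routine name ListKeysAndItems is hypothetical, and the keys and items are the sample values from the original post.

```vb
' Sketch only: late-bound Scripting.Dictionary, so no project reference is needed.
Sub ListKeysAndItems()
    Dim dict As Object
    Dim varKey As Variant

    Set dict = CreateObject("Scripting.Dictionary")

    ' Note the argument order: Dictionary.Add takes the key first, then the item
    ' (the reverse of Collection.Add).
    dict.Add "myKey01", "myItem01"
    dict.Add "myKey02", "myItem02"

    ' Unlike a Collection, a Dictionary exposes its keys.
    For Each varKey In dict.Keys
        Debug.Print varKey, dict.Item(varKey)
    Next varKey
End Sub
```

The trade-off is the one ralph notes: the application then depends on an external component.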
Michael Cole
2013-05-08 01:48:25 UTC
Post by ralph
Post by s***@yahoo.com
I can't seem to see a way to derive the "KEY" value assigned to a member
of the collection while using the For Each approach. The Default property
of the collection object is the "ITEM" value.
/snipped
I don't see a way to determine the "KEY" value. Use of the KEY value would
help me compare one collection to another collection.
/snipped
I realize I can enter the KEY value in BOTH the KEY & ITEM entries to get my
end result, but it doubles the data size of the collection.
As you discovered, you can't, as there is no "Key" property.
There are various work-arounds using additional arrays or collections,
and mechanisms to combine multiple properties within a single value.
It sounds like you have explored them.
[The Scripting Dictionary is another solution, but would require an
outside reference. Dictionaries do seem a tad faster than VB
Collections.]
The easiest way, and my personal preference, is to create your own
Collection and add a Key property, as per CVMichael's suggestion ...
http://www.vbforums.com/showthread.php?423549-RESOLVED-Displaying-the-quot-Key-quot-in-a-Collection
Another solution would be to use three standard collections - two for
holding the datasets, and one for holding the list of dataset field
names from both.

Then go through the field names collection in sequence, and for every
field name, compare the items in the two dataset collections.

I can explain this better if it is not clear.
--
Michael Cole
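
The three-collection approach described above could look roughly like this. It is a sketch under stated assumptions: colA and colB hold String items keyed by field name, colNames holds every field name found in either dataset, and ItemOrMissing and CompareDatasets are hypothetical names, not anything from the thread.

```vb
' Hypothetical helper: returns the item stored under sKey,
' or a marker string when sKey is not present in col.
Function ItemOrMissing(col As Collection, ByVal sKey As String) As String
    On Error Resume Next
    ItemOrMissing = "<missing>"
    ItemOrMissing = col.Item(sKey)   ' raises an error if the key is absent; the marker then stays
    On Error GoTo 0
End Function

' Walks the list of field names and reports every field whose value
' differs between the two datasets (or is missing from one of them).
Sub CompareDatasets(colA As Collection, colB As Collection, colNames As Collection)
    Dim varName As Variant

    For Each varName In colNames
        If ItemOrMissing(colA, CStr(varName)) <> ItemOrMissing(colB, CStr(varName)) Then
            Debug.Print "Mismatch for " & varName
        End If
    Next varName
End Sub
```

The key never has to be read back out of a collection here, because the field names are kept as items in colNames.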
GS
2013-05-08 03:55:30 UTC
Post by Michael Cole
Another solution would be to use three standard collections - two for
holding the datasets, and one for holding the list of dataset field
names from both.
Then go through the field names collection in sequence, and for every
field name, compare the items in the two dataset collections.
I can explain this better if it is not clear.
IMO, it would be easier, then, to use arrays for what you suggest. Why
bother with the extra code to create/loop collections?
--
Garry

Free usenet access at http://www.eternal-september.org
Classic VB Users Regroup!
comp.lang.basic.visual.misc
microsoft.public.vb.general.discussion
Deanna Earley
2013-05-08 08:19:51 UTC
Post by ralph
The easiest way, and my personal preference, is to create your own
Collection and add a Key property, as per CVMichael's suggestion ...
http://www.vbforums.com/showthread.php?423549-RESOLVED-Displaying-the-quot-Key-quot-in-a-Collection
You don't necessarily require a custom collection class, but you can add
custom classes that have their "Key" property to a standard collection.
Post by ralph
However, as noted this will also store a "Key" twice.
Note that the key in the collection is normally hashed so only an extra
few bytes per entry is used.
--
Deanna Earley (***@icode.co.uk)
iCatcher Development Team
http://www.icode.co.uk/icatcher/

iCode Systems

(Replies direct to my email address will be ignored. Please reply to the
group.)
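
A sketch of the idea above: a small class that carries its own Key, stored in a standard Collection. The class name CEntry, its two members, and the routine name Demo are assumptions made for illustration only.

```vb
' --- Class module named CEntry (hypothetical) ---
Option Explicit
Public Key As String
Public Value As String

' --- Form or standard module ---
Sub Demo()
    Dim col As New Collection
    Dim newEntry As CEntry
    Dim entry As CEntry

    Set newEntry = New CEntry
    newEntry.Key = "myKey01"
    newEntry.Value = "myItem01"

    ' The key is passed to the Collection for lookup and also kept on the object.
    col.Add newEntry, newEntry.Key

    ' For Each hands back each stored object, so its Key is readable directly.
    For Each entry In col
        Debug.Print entry.Key, entry.Value
    Next entry
End Sub
```

The key is still stored twice (once by the Collection, once on the object), as ralph pointed out.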
s***@yahoo.com
2013-05-08 15:27:07 UTC
Deanna ~ thanks for that bit. I forgot that the Key value is actually ingested into a fixed length hash value. With that in mind I guess it serves my purpose to use my (textual data string) as both the Key & Item entries in the collection. If I absolutely need separate Key & Item values, I could use some simple approaches as pointed out by Ralph via his supplied link [ see below ]:

Dim MyCollection As New Collection
Dim sCollectionData(1 To 2) As Variant

'To add elements: store the key alongside the value in a two-element array
sCollectionData(1) = sKey
sCollectionData(2) = sValue
MyCollection.Add sCollectionData, sKey

'To retrieve it
'Retrieve by ordinal
sKey = MyCollection(AnyIndex)(1) 'Key retrieved
sValue = MyCollection(AnyIndex)(2) 'Value retrieved

'Retrieve by key
sKey = MyCollection(AnyKey)(1) 'Key retrieved
sValue = MyCollection(AnyKey)(2) 'Value retrieved

I see methods of accomplishing this with 2 collections, and also as Michael suggested, with 3 collections ( whichever approach provides less time overhead )

Thanks everyone!

~ Steve
ralph
2013-05-08 16:25:43 UTC
Post by s***@yahoo.com
I see methods of accomplishing this with 2 collections, and also as Michael
suggested, with 3 collections ( whichever approach provides less time overhead )
That last part ("less time overhead") can be problematic with a VB
Collection - period.

The VB Collection object is ancient and has never been upgraded. It
has always been, albeit damn convenient, robust, functional, and
simple, a slow performer. It is slower than the Scripting Dictionary
for example. If 'speed' is of primary importance then you are better
off to look elsewhere. Just Google for "VB Collection replacement".

[Here is but one site that might be of interest.
http://www.mvps.org/vbvision/collections.htm
]

-ralph
ralph
2013-05-08 16:52:37 UTC
Post by s***@yahoo.com
I see methods of accomplishing this with 2 collections, and also as Michael
suggested, with 3 collections ( whichever approach provides less time overhead )
But before you get into too many 'performance' options and comparisons
you might want to digress a bit and take a look at this ...

Premature optimization is the root of all evil -- Donald Knuth
http://c2.com/cgi/wiki?PrematureOptimization

Required reading for all programmers.

Like Bruce McKinney quoted - "It doesn't matter how fast it is if it
doesn't work." VB Collections *work*. <g>

-ralph
s***@yahoo.com
2013-05-08 19:00:19 UTC
Thanks Ralph. I haven't had a chance yet to investigate the VB Collection replacements, but I did quickly read through the premature optimization reference. I found it falls in line with my own experiences and it is a tad humorous as well. My requirement might be considered "Mature Optimization", as I am upgrading my fairly old, well established application that is still heavily in use today. I know exactly where the bottlenecks are :-) It is the size of the datasets, which continues to increase, that is crying out for a speed enhancement. I have used the [Dictionary Object] within the VBScript environment a number of times, and this is where I thought the 'similar' underlying Linked-list / Hash table would benefit me with the VB Collection.

~ Steve
Eduardo
2013-07-01 01:48:19 UTC
Post by Deanna Earley
Note that the key in the collection is normally hashed so only an extra
few bytes per entry is used.
This is strange. For speeding up the read access it's logical to use hashes,
but still, if the full information of the key is not stored somewhere (to
check "special cases"), it would be subject to collisions (to different keys
pointing to the same element).
ralph
2013-07-01 03:09:31 UTC
Post by Eduardo
Post by Deanna Earley
Note that the key in the collection is normally hashed so only an extra
few bytes per entry is used.
This is strange. For speeding up the read access it's logical to use hashes,
but still, if the full information of the key is not stored somewhere (to
check "special cases"), it would be subject to collisions (to different keys
pointing to the same element).
Since an error is thrown with any attempt to add an item with an
existing key, and you can easily add multiple copies of the same
object to a collection - as long as each entry has a different key -
what are these special cases and possible collisions?

-ralph
Schmidt
2013-07-01 05:58:49 UTC
Post by ralph
Post by Eduardo
Post by Deanna Earley
Note that the key in the collection is normally hashed so only an extra
few bytes per entry is used.
This is strange. For speeding up the read access it's logical to use hashes,
but still, if the full information of the key is not stored somewhere (to
check "special cases"), it would be subject to collisions (to different keys
pointing to the same element).
Since an error is thrown with any attempt to add an item with an
existing key, and you can easily add multiple copies of the same
object to a collection - as long as each entry has a different key -
what are these special cases and possible collisions?
I think the second part of Eduardo's statement:
"... pointing to the same element"
is a bit misleading, because what he meant was
probably:

"if the full information of the key is not stored somewhere,
and the calculated Hash-Values of different Keys collide
(come out the same - and point to the same HashIndex-Pos),
then there's no additional criterion to separate those
same Entries in the HashTable for a successful Lookup".

Or something along those lines...

Hash-Values are like some sort of CRC32 (but with a shorter
bit length) - and so the collision probability for String-Keys
(producing the same Hash-Index) is relatively high,
and mechanisms need to be in place to do safe fallbacks
in these cases. The common (and easiest) approach is to
also store the colliding entries under the same HashIndex,
should they happen - and to keep them "unique" and
distinguishable there, we cannot store only the Item under
the given HashValue; the String-Keys need to be stored
in a "lossless format" as well - to be able to perform the
(then of course smaller) fallback-loop (under the given HashIndex).


Olaf
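
The fallback described above (colliding entries kept under the same HashIndex, with the full key stored for comparison) can be sketched as a tiny chained hash table. This illustrates the general technique only; it makes no claim about how the VB Collection is implemented internally. All names (BUCKET_COUNT, HashIndex, AddEntry, TryGetItem) and the hash formula are assumptions chosen for the example.

```vb
Option Explicit

' Deliberately tiny so that different keys often share a bucket.
Private Const BUCKET_COUNT As Long = 8

Private Type Entry
    Key As String    ' full key, stored losslessly
    Item As String
End Type

Private Type Bucket
    Count As Long
    Entries() As Entry
End Type

Private m_Buckets(0 To BUCKET_COUNT - 1) As Bucket

' Maps a string key to a bucket index; many keys map to the same index.
Private Function HashIndex(ByVal sKey As String) As Long
    Dim i As Long
    Dim h As Long

    For i = 1 To Len(sKey)
        ' Mask AscW to 0..65535 so h never goes negative.
        h = (h * 31 + (AscW(Mid$(sKey, i, 1)) And &HFFFF&)) Mod 65521
    Next i
    HashIndex = h Mod BUCKET_COUNT
End Function

' Appends the entry to its bucket (no duplicate-key check in this sketch).
Public Sub AddEntry(ByVal sKey As String, ByVal sItem As String)
    Dim idx As Long
    Dim n As Long

    idx = HashIndex(sKey)
    n = m_Buckets(idx).Count
    ReDim Preserve m_Buckets(idx).Entries(0 To n)
    m_Buckets(idx).Entries(n).Key = sKey
    m_Buckets(idx).Entries(n).Item = sItem
    m_Buckets(idx).Count = n + 1
End Sub

' The fallback loop: only the entries in one bucket are scanned, and the
' stored full key decides which of the colliding entries is the right one.
Public Function TryGetItem(ByVal sKey As String, ByRef sItem As String) As Boolean
    Dim idx As Long
    Dim i As Long

    idx = HashIndex(sKey)
    For i = 0 To m_Buckets(idx).Count - 1
        If m_Buckets(idx).Entries(i).Key = sKey Then
            sItem = m_Buckets(idx).Entries(i).Item
            TryGetItem = True
            Exit Function
        End If
    Next i
End Function
```

Without the Key field in Entry, two keys landing in the same bucket could not be told apart, which is the point Eduardo raised.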
Eduardo
2013-07-01 06:47:11 UTC
Post by Schmidt
Post by ralph
Post by Eduardo
Post by Deanna Earley
Note that the key in the collection is normally hashed so only an extra
few bytes per entry is used.
This is strange. For speeding up the read access it's logical to use hashes,
but still, if the full information of the key is not stored somewhere (to
check "special cases"), it would be subject to collisions (to different keys
pointing to the same element).
Since an error is thrown with any attempt to add an item with an
existing key, and you can easily add multiple copies of the same
object to a collection - as long as each entry has a different key -
what are these special cases and possible collisions?
"... pointing to the same element"
is a bit misleading, because what he meant was
"if the full information of the key is not stored somewhere,
and the calculated Hash-Values of different Keys collide
(come out the same - and point to the same HashIndex-Pos),
then there's no additional criterion to separate those
same Entries in the HashTable for a successful Lookup".
Or something along those lines...
Hash-Values are like some sort of CRC32 (but with a shorter
bit length) - and so the collision probability for String-Keys
(producing the same Hash-Index) is relatively high,
and mechanisms need to be in place to do safe fallbacks
in these cases. The common (and easiest) approach is to
also store the colliding entries under the same HashIndex,
should they happen - and to keep them "unique" and
distinguishable there, we cannot store only the Item under
the given HashValue; the String-Keys need to be stored
in a "lossless format" as well - to be able to perform the
(then of course smaller) fallback-loop (under the given HashIndex).
Exactly Olaf, that's what I meant to say.

Here there is some reading that may be related to this issue:
http://epaperpress.com/vbhash/
(in the link "PDF format")
Eduardo
2013-07-01 07:08:40 UTC
Post by ralph
Post by Eduardo
Post by Deanna Earley
Note that the key in the collection is normally hashed so only an extra
few bytes per entry is used.
This is strange. For speeding up the read access it's logical to use hashes,
but still, if the full information of the key is not stored somewhere (to
check "special cases"), it would be subject to collisions (to different keys
pointing to the same element).
Since an error is thrown with any attempt to add an item with an
existing key, and you can easily add multiple copies of the same
object to a collection - as long as each entry has a different key -
what are these special cases and possible collisions?
I don't know how VB handles keys, but my point is that to do it properly
(for all cases), it's not possible to do (as I understand it) without
storing the key in a lossless way somewhere. (see my other posts)
Eduardo
2013-07-01 07:01:48 UTC
Post by Deanna Earley
Note that the key in the collection is normally hashed so only an extra
few bytes per entry is used.
(to different keys pointing to the same element).
I meant two or more different keys producing the same hash.

In the link https://en.wikipedia.org/wiki/Hash_function there is an image
illustrating this (the first figure at the top right)

So, to be sure that two "input" keys are the same, it's not enough to
compare their hashes, because two different keys can produce the same hash,
so the key must be stored, perhaps not exactly as it is but compressed in a
lossless way.
Vincent Belaïche
2013-07-05 04:30:46 UTC
Post by Eduardo
Post by Deanna Earley
Note that the key in the collection is normally hashed so only an extra
few bytes per entry is used.
(to different keys pointing to the same element).
I meant two or more different keys producing the same hash.
In the link https://en.wikipedia.org/wiki/Hash_function there is an image
illustrating this (the first figure at the top right)
So, to be sure that two "input" keys are the same, it's not enough to
compare their hashes, because two different keys can produce the same hash,
so the key must be stored, perhaps not exactly as it is but compressed in a
lossless way.
Just one question: isn't the root cause why you cannot get the key that
you do some "Let" assignment to a Variant object like this:

Dim vElement As Variant, oCollection As Collection
....
Let vElement = oCollection.Item(1)

and not a Set assignment to an Object object like this:

Dim oElement As Object, oCollection As Collection
....
Set oElement = oCollection.Item(1)

I must say that I cannot check the code above because I do not have any VB
other than VBScript installed on my Windows XP machine --- can you
install some VB6 for free?

I mean that when you do the Let assignment there is some type cast
which makes you lose some of the information in the indexed object. I
do not know how Collections are implemented, but I fully share what
Eduardo has written: the key needs to be stored somewhere in a lossless
way for a hash table to work.

BR,
Vincent.
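
For reference, a small sketch of the Let versus Set distinction raised above. The VB6 Collection object exposes only Add, Count, Item and Remove, so neither assignment form gives access to a key; the difference is only whether the item is a plain value or an object reference. The routine name LetVersusSet is hypothetical.

```vb
Sub LetVersusSet()
    Dim col As New Collection
    Dim vElement As Variant
    Dim oElement As Object

    col.Add "myItem01", "myKey01"       ' a String item
    col.Add New Collection, "myKey02"   ' an object item

    vElement = col.Item(1)        ' Let assignment: fine for a String item
    Set oElement = col.Item(2)    ' Set assignment: required for an object reference

    ' Both variables now hold the item only; the key is not part of either.
    Debug.Print vElement, TypeName(oElement)
End Sub
```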
Eduardo
2013-07-05 06:31:03 UTC
Post by Vincent Belaïche
Post by Eduardo
Post by Deanna Earley
Note that the key in the collection is normally hashed so only an extra
few bytes per entry is used.
(to different keys pointing to the same element).
I meant two or more different keys producing the same hash.
In the link https://en.wikipedia.org/wiki/Hash_function there is an image
illustrating this (the first figure at the top right)
So, to be sure that two "input" keys are the same, it's not enough to
compare their hashes, because two different keys can produce the same hash,
so the key must be stored, perhaps not exactly as it is but compressed in a
lossless way.
Just one question: isn't the root cause why you cannot get the key that
I'm not having any problem, I just commented about something that Deanna
said.

Who was having a problem (a month ago or so) was the one who started the
thread.

The problem was that it's not possible to retrieve the key in a VB standard
collection.
Post by Vincent Belaïche
Dim vElement As Variant, oCollection As Collection
....
Let vElement = oCollection.Item(1)
Dim oElement As Object, oCollection As Collection
....
Set oElement = oCollection.Item(1)
I must say that I cannot check the code above because I do not have any VB
other than VBScript installed on my Windows XP machine --- can you
install some VB6 for free?
VB6 is not sold any more. It was not free when it was sold (more than 10
years ago).
Post by Vincent Belaïche
I mean that when you do the Let assignment there is some type cast
which makes you lose some of the information in the indexed object. I
do not know how Collections are implemented, but I fully share what
Eduardo has written: the key needs to be stored somewhere in a lossless
way for a hash table to work.
BR,
Vincent.