Hash table

Oforth examples. Feel free to post your own code.

Hash table

Postby Doug » 09 Aug 2018 20:51

Code: Select all
\ Key collisions are handled via separate chaining.
\ An attempt to #add with an existing key will simply replace the value.


Object Class new: Node ( key mutable val mutable next )
  m: initialize \ val key --
     := key := val 0 := next ;
  m: get-key @key ; \ -- s
  m: get-val @val ; \ -- n
  m: !val := val ; \ n --
  m: get-next @next ; \ -- next-node
  m: !next := next ; \ next-node --
  m: << @key << "," << @val << " " << @next << "  " << ;


: hash-str ( s -- hash )
  | i |
  0 s size loop: i [ i s at i 1+ * + ] ;

: */ ( n1 n2 n3 -- result )
  n1 n2 * n3 / ;


Object Class new: Hash-table ( mutable table mutable no-of-nodes tload mutable a )

m: initialize  \ n --
    0 Array newWith := table 0 := no-of-nodes 100 := tload ;

\ given a key return the table index and contents of the related bucket ( a Node or Integer 0 )
m: buckets-table \ ( k -- idx node | idx 0 )
  hash-str @table size mod abs 1+ dup @table at ;


\ 3 possible outcomes for search method:
\ 1) not found (empty bucket), return index/1/false
\ 2) not found (but there was a hash-collision, return last node in chain  node/0/false
\ 3) found, return node/true
m: search ( k -- node true | idx 1 false | node 0 false )
    k self buckets-table
    \ idx node-obj or idx 0
     dup 0 = if  drop 1 false return else nip then \ => idx 1 false
     \ not 0, so must be a node
     \ if we got here, then it's a bucket collision
     \ node
     begin
       dup get-key k == if true return  then \ => node true
       dup get-next \ will get a chained node or 0
       \ node1 node2 or node1 0
       dup 0 = if 0 return then \ => node1 0 false
       nip \ node2
     again ;

m: inc-nodes \ -1 or +1
  @no-of-nodes + := no-of-nodes ;

m: remove ( k -- true | false )
    | idx last-node |
    k self buckets-table swap -> idx
    \ node-obj or 0
     dup 0 = if return then \ => false
     dup get-key k == if get-next idx swap @table put -1 self inc-nodes true return then
     \ not found yet, possibly key is in chained nodes
     begin
       dup -> last-node
       get-next
       dup 0 = if return then \ => false
       dup get-key k == if get-next last-node !next -1 self inc-nodes true return  then
     again ;   

 m: get \ ( k -- val true | false )
   self search
   if \ node
      get-val true
   else \ idx 1 or node 0
      2drop false
   then ;

m: nodes-to-a \ node
     begin
       dup @a add
       get-next \ will get a chained node or 0
       \ node2 or 0
       dup 0 = if drop return then
       \ node2
     again ;

\ Array a is used as temporary storage for all nodes while a new
\ table is being constructed
m: expand-table \ -- true or false
  @no-of-nodes @table size @tload 100 */ >
  if Array new := a
     #[ dup class #Node = if self nodes-to-a else drop then ] @table apply
     true return
   then false ;

m: size \ ( -- n ) \ not used here, but could be useful
  @no-of-nodes ;

\ note that value can be anything, key must be a string object
m: add ( v k -- )
    k self search
    if \ node
       \ key already present so just over-write val
        v swap !val
    else \ idx 1 or node 0
          v k Node new 1 self inc-nodes  \  idx 1 nnode or  node 0 nnode
          swap \ idx newnode 1 or node newnode 0
          if \ "node to empty table slot" .
             @table put
          else \ "chain node due to hash collision" .
             swap !next
          then
          \ now expand the table if necessary
          self expand-table
          if @table size 2 * 1+ self initialize
             #[ dup get-val swap get-key self add ] @a apply
             null := a \ will force a garbage collect of a?
          then
    then ;


m: << \ display each node
  @no-of-nodes << " nodes total  " << @table size << " total buckets table size" <<
  #[ dup if printcr << else drop then ] @table apply ;


tvar: h

\ Needs an initial size at instantiation. I arbitrarily chose 10.
\ But the table will automatically resize itself. I assume the table size is limited only by RAM.
10 Hash-table new to h

 15 "pig" h add
 88 "dog" h add
 99 "cat" h add
 66 "goat" h add
 33 "rat" h add
 2 "frog" h add
 55 "fly" h add
 67 "gnat" h add
 9 "hat" h add
 10 "jet" h add
 12 "cart" h add
 18 "hole" h add
 18 "hole2" h add
 18 "hole3" h add
 18 "hole4" h add
 18 "hole5" h add
 18 "hole6" h add
 18 "hole7" h add
 18 "hole8" h add
 15 "pig2" h add
 15 "pig3" h add
 15 "pig4" h add
 15 "pig5" h add
 15 "pig6" h add
 15 "pig7" h add
 15 "pig8" h add
 15 "pig9" h add
 15 "pig10" h add

h .

Doug
 
Posts: 23
Joined: 20 Jul 2018 14:26

Re: Hash table

Postby Franck » 11 Aug 2018 13:24

Hello Doug,

Great code !

There is a Hash class into the "collect" package, but it is less powerfull than yours : your version has an automatic reallocation feature.

Just two comments :
- There is built-in #hashValue method that returns a hash for an object (it works for all kind of objects, not only string).
- You could declare a #forEachNext method for your Hash-table class. If so, you could use all HOF directly on hash-tables objects (#apply, ...).

Your code is quite good. You seem to be quite confortable with advanced Oforth mecanisms, no ? (blocks, gc, ...).

Regards,
Franck
Franck
 
Posts: 161
Joined: 29 Oct 2014 19:01

Re: Hash table

Postby Doug » 11 Aug 2018 21:47

Hi Franck,

I suspected that there might be a hash table already in Oforth. But I wanted to do it as an exercise and to continue learning the Oforth language. I will look at Oforth's hash table now (I haven't seen it yet).

You are right. I am beginning to get very comfortable with Oforth and its features. The more I use Oforth the more I like it. Standard Forth is fine, but I don't do embedded programming and it is so very nice to have many object types built in to the language (strings, arrays, etc) along with a rich pre-made functionality. But it still "feels" very much like Forth, which is surprising me a bit. I don't miss DO LOOPs at all though.

Thanks for the comments.

-Doug
Doug
 
Posts: 23
Joined: 20 Jul 2018 14:26

Re: Hash table

Postby Doug » 12 Aug 2018 11:59

Franck,

Have now read your code for a hash table. Wow. That is concise! It will take me awhile to digest it.

It occurs to me that one more reason, of the many commonly known, for using OOP is the ability to create the HOFs as you have done in Oforth. The HOFs are very powerful indeed and deserve study to be used. Without polymorphism these HOFs would be difficult to create and use, I believe. I have heard it said that OOP is only useful for creating GUIs. Of course this notion is incorrect. I have been trying to push for a standardized OOP in Forth for years. But the resistance to that has been incredible and I long ago gave up.

One other comment about Oforth: It appears to me that Oforth has several more built in "recognizers" compared to standard Forth. Strings, arrays, @xxx for instance variables, ->xxx for locals, JSON objects, and likely a few more. I find these recognizers make programming in Oforth easier and more readable.

I really like what you have done with Oforth. Thanks again.

-Doug
Doug
 
Posts: 23
Joined: 20 Jul 2018 14:26

Re: Hash table

Postby Franck » 13 Aug 2018 15:08

Thank you Doug,

Yes, the HOF, combined with blocks (or quotations) are very powerfull tools.

The advantage of polymorphism is that, when you define the #forEachNext method for a user defined class, you have access to all already defined HOF.
You don't have to create all HOF for each collection.

Oforth does not implement full recognizers (and I'am not quite decided yet to the usefulness of recognizers as they are discussed in the Forth community).

Oforth just implements prefixes and suffixes :
- when a character is tagged as "prefix" and it is the first character read, the interpreter will use this character as a word and stop to read further.
- When a character is tagged as "suffix", the interpreter will stop the reading of a token when it encounters this character (the character will be the first of the next reading).

This allows to recognize many patterns (arrays, json, ...).

Here are the characters concerned :

Code: Select all
   InputChar[0]    =           CSUFFIX ;
   InputChar['\n'] =           CSUFFIX ;
   InputChar['\r'] =           CSUFFIX ;
   InputChar['\t'] =           CSUFFIX ;
   InputChar[' ']  =           CSUFFIX ;
   InputChar['{']  = CPREFIX | CSUFFIX ;
   InputChar['}']  = CPREFIX | CSUFFIX ;
   InputChar['[']  = CPREFIX | CSUFFIX ;
   InputChar[']']  = CPREFIX | CSUFFIX ;
   InputChar[',']  = CPREFIX | CSUFFIX ;
   InputChar['(']  = CPREFIX | CSUFFIX ;
   InputChar[')']  = CPREFIX | CSUFFIX ;
   InputChar['|']  = CPREFIX | CSUFFIX ;
   InputChar['#']  = CPREFIX ;
   InputChar['\"'] = CPREFIX ;
   InputChar['\''] = CPREFIX ;
   InputChar['$']  = CPREFIX ;
   InputChar['@']  = CPREFIX ;
   InputChar[';']  = CPREFIX ;
   InputChar['\\'] = CPREFIX ;


Thanks again for your returns.

Franck
Franck
 
Posts: 161
Joined: 29 Oct 2014 19:01

Re: Hash table

Postby Doug » 13 Aug 2018 15:15

Thanks for the useful reply, Franck.

I can see that studying Oforth's already defined HOFs is something I need to do. It is a powerful way to program.

Also, I agree that recognizers as such are not a big issue. The way you have Oforth set up with prefixes and suffixes is fine.

Regards,

-Doug
Doug
 
Posts: 23
Joined: 20 Jul 2018 14:26

Re: Improved Hash table

Postby Doug » 26 Aug 2018 22:20

Improved version. See notes at beginning of code.
See example use at end of code.
Oforth's HOFs are a clever and powerful way to program.


Code: Select all

\ #forEachNext method defined as per Franck's suggestion.
\  Doing so then makes many useful HOFs available for free!  Very nice!
\ See a few HOF examples at the end of the file.

\ replaced my hash funtion with Oforth's built-in function #hashValue

\ Removed some un-needed word names at end of program using #forget:


Object Class new: Node ( key mutable val mutable next )
  m: initialize \ val key --
     := key := val 0 := next ;
  m: get-key @key ; \ -- s
  m: get-val @val ; \ -- n
  m: !val := val ; \ n --
  m: get-next @next ; \ -- next-node
  m: !next := next ; \ next-node --
  m: << @key << "," << @val << ;

: */ ( n1 n2 n3 -- result )
  n1 n2 * n3 / ;

Object Class new: Hash-table ( mutable table mutable no-of-nodes tload mutable a  )

m: initialize  \ n --
    0 Array newWith := table 0 := no-of-nodes 100 := tload ;


m: doHash \ str-obj -- hash
    hashValue @table size mod abs 1+ ;


\ given a key return the table index and contents of the related bucket ( a Node or Integer 0 )
m: buckets-table \ ( k -- idx node | idx 0 )
\  hash-str @table size mod abs 1+ dup @table at ;
  self doHash dup @table at ;


\ 3 possible outcomes for search method:
\ 1) not found (empty bucket), return index/1/false
\ 2) not found (but there was a hash-collision, return last node in chain  node/0/false
\ 3) found, return node/true
m: search ( k -- node true | idx 1 false | node 0 false )
    k self buckets-table
    \ idx node-obj or idx 0
     dup 0 = if  drop 1 false return else nip then \ => idx 1 false
     \ not 0, so must be a node
     \ if we got here, then it's a bucket collision
     \ node
     begin
       dup get-key k == if true return  then \ => node true
       dup get-next \ will get a chained node or 0
       \ node1 node2 or node1 0
       dup 0 = if 0 return then \ => node1 0 false
       nip \ node2
     again ;

m: inc-nodes \ -1 or +1
  @no-of-nodes + := no-of-nodes ;


m: remove ( k -- true | false )
    | idx last-node |
    k self buckets-table swap -> idx
    \ node-obj or 0
     dup 0 = if return then \ => false
     dup get-key k == if get-next idx swap @table put -1 self inc-nodes true return then
     \ not found yet, possibly key is in chained nodes
     begin
       dup -> last-node
       get-next
       dup 0 = if return then \ => false
       dup get-key k == if get-next last-node !next -1 self inc-nodes true return  then
     again ;   

 m: get \ ( k -- val true | false )
   self search
   if \ node
      get-val true
   else \ idx 1 or node 0
      2drop false
   then ;

m: nodes-to-a \ node
     begin
       dup @a add
       get-next \ will get a chained node or 0
       \ node2 or 0
       dup 0 = if drop return then
       \ node2
     again ;

\ Array a is used as temporary storage for all nodes while a new
\ table is being constructed
m: expand-table \ -- true or false
  @no-of-nodes @table size @tload 100 */ >
  if Array new := a
     #[ dup class #Node = if self nodes-to-a else drop then ] @table apply
     true return
   then false ;

m: size \ ( -- n ) \ not used here, but could be useful
  @no-of-nodes ;

\ note that value can be anything, key must be a string object
m: add ( v k -- )
    k self search
    if \ node
       \ key already present so just over-write val
        v swap !val
    else \ idx 1 or node 0
          v k Node new 1 self inc-nodes  \  idx 1 nnode or  node 0 nnode
          swap \ idx newnode 1 or node newnode 0
          if \ "node to empty table slot" .
             @table put
          else \ "chain node due to hash collision" .
             swap !next
          then
          \ now expand the table if necessary
          self expand-table
          if @table size 2 * 1+ self initialize
             #[ dup get-val swap get-key self add ] @a apply
             null := a \ will force a garbage collect of a?
          then
    then ;

\ note that by defining #forEachNext we then get #apply #applyIf and many HOFs for free!
 m: forEachNext  \  x -- y next-node true | false    \ x and y are nodes
   | idx i |
   \ first time through must get a hit on a table slot
   dup null? if drop  @table size loop: i [ i @table at dup if dup true return else drop then ] then
   \ subsequent times through
   dup get-key self doHash -> idx
   \ node
   get-next dup
   if dup true return  \ found a linked node
   else
   \ 0
   \ we have a 0, must find non-empty bucket in table
   drop
   idx 1+ @table size for: i [ i @table at dup if dup true return else drop then ] \ hit on a table node
   then
   false ; \ if we got here, then no more nodes in table

 m: << \ display each node
   @no-of-nodes << " nodes total  " << @table size << " total buckets table size" <<
   #[ printcr dup get-key . get-val . ] self apply ;


\ prune word names no longer needed
forget: doHash
forget: buckets-table
forget: inc-nodes
forget: nodes-to-a
forget: expand-table

tvar: h

\ initial table size of 10 buckets
\ It will grow as needed
10 Hash-table new to h

 15 "pig" h add
 88 "dog" h add
 99 "cat" h add
 66 "goat" h add
 33 "rat" h add
 2 "frog" h add
 55 "fly" h add
 67 "gnat" h add
 9 "hat" h add
 10 "jet" h add
 12 "cart" h add
 18 "hole" h add
 18 "hole2" h add
 18 "hole3" h add
 18 "hole4" h add
 18 "hole5" h add
 18 "hole6" h add
 18 "hole7" h add
 18 "hole8" h add
 15 "pig2" h add
 15 "pig3" h add
 15 "pig4" h add
 15 "pig5" h add
 15 "pig6" h add
 15 "pig7" h add
 15 "pig8" h add
 15 "pig9" h add
 15 "pig10" h add
 66 "goat1" h add
 66 "goat2" h add
 66 "goat3" h add
 66 "goat4" h add
 66 "goat5" h add
 66 "goat6" h add
 66 "goat7" h add
 66 "goat8" h add
 66 "goat9" h add
 66 "goat10" h add
 66 "goat11" h add
 66 "goat12" h add
 66 "goata" h add
 66 "goatq" h add
 66 "goatw" h add
 66 "goate" h add
 66 "goatr" h add
 66 "goatt" h add
 66 "goaty" h add
 66 "goatu" h add
 66 "goati" h add
 66 "goato" h add
 66 "goatp" h add
 66 "goats" h add
 66 "goatd" h add
 66 "goatf" h add
 66 "goatg" h add
 66 "goath" h add
 66 "goatj" h add
 66 "goatk" h add
 
"Print the Hash-table using . " . printcr
h .

printcr
printcr
"Print the node with the highest val using #maxFor" . printcr
#get-val h maxFor .


66666 "goat" h add  \ will over-write goat val

printcr
"remove node with key=jet using #remove" . printcr
"jet" h remove drop

printcr
"Add 1 to each val using #apply" . printcr
#[ dup get-val 1+ swap !val ] h apply

printcr
printcr
"Print the Hash-table using . " . printcr
h .

printcr
printcr
"Print only those nodes having val=67 using #applyIf " . printcr
#[ get-val 67 = ] #[ printcr . ] h applyIf

printcr
printcr
"Print the node with the highest val using #maxFor" . printcr
#get-val h maxFor .


Doug
 
Posts: 23
Joined: 20 Jul 2018 14:26


Return to Oforth examples

Who is online

Users browsing this forum: No registered users and 1 guest

cron