UTF8 - To Malloc or not to Malloc - that is another question

Published 10 MAR 2010 at 10:03:00AM by Captain C

In our recent post on using memory pre-allocation when building large strings, commenter M@ pointed out, quite correctly, that using the normal [] operators in UTF8 mode incurs a severe performance hit, because the character position of the insertion point has to be calculated on every iteration.
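To make the cost concrete, here is a minimal Python sketch (not OpenInsight code, and not how the engine is actually implemented; the function name byte_offset_of_char is hypothetical). It assumes a character-indexed lookup has to walk the UTF-8 bytes from the start of the buffer to find where a given character begins, so the work grows with the position being looked up.

```python
def byte_offset_of_char(data: bytes, char_index: int) -> int:
    """Return the byte offset at which the 0-based character `char_index` starts.

    Illustrative only: scans from the start of the buffer, so the cost
    grows with the position being looked up.
    """
    seen = 0
    for offset, byte in enumerate(data):
        # UTF-8 continuation bytes look like 0b10xxxxxx; any other byte starts a character.
        if (byte & 0xC0) != 0x80:
            if seen == char_index:
                return offset
            seen += 1
    if seen == char_index:
        return len(data)  # position just past the last character
    raise IndexError("character index out of range")


# Characters of 1, 2, 3 and 1 bytes respectively.
sample = "a\u00e9\u20acb".encode("utf-8")
offsets = [byte_offset_of_char(sample, i) for i in range(4)]
```

Repeating a scan like this once per loop iteration, against an ever larger insertion position, is what turns a linear job into a quadratic one.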

A suggested workaround was to switch temporarily to ANSI mode for the [] operation and switch back afterwards. This is a valid solution, and one we've used ourselves before, but it does create a possible failure point: if your system hits a fatal debug condition before you switch back, you might unknowingly be stuck in ANSI mode, which could result in subsequent data corruption.

A safer alternative is to use the PutBinaryValue function that we documented here. It ignores any string encoding and does a straightforward binary copy to the specified offset.
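The general idea of a byte-offset copy into a pre-allocated buffer can be sketched in Python as follows. This is an analogy only, not the PutBinaryValue implementation; the function name build_preallocated and the sample chunk are hypothetical stand-ins.

```python
def build_preallocated(chunk: bytes, count: int) -> bytes:
    """Fill a pre-allocated buffer with `count` copies of `chunk` by byte offset."""
    chunk_len = len(chunk)                           # size in bytes, not characters
    buffer = bytearray(b" " * (chunk_len * count))   # pre-allocate, padded with spaces
    offset = 0
    for _ in range(count):
        # Same-length slice assignment overwrites in place; the buffer never resizes.
        buffer[offset:offset + chunk_len] = chunk
        offset += chunk_len
    return bytes(buffer)


# Hypothetical stand-in data containing one multi-byte character.
chunk = "ABC\u00e9".encode("utf-8")
result = build_preallocated(chunk, 3)
```

Because the offset is tracked in bytes, each write goes straight to its destination and no character positions are ever counted.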

Here's the Preallocation sample program from the previous post updated with the binary functions:

Subroutine ZZ_SpeedTest( Void )

   Declare Function TimeGetTime

   startTime    = TimeGetTime()
   stringLength = GetByteSize( @Upper.Case : @Fm )
   totalLength  = stringLength * 99999
   newArray     = Space(totalLength)
   arrayPtr     = 1

   For loopPtr = 1 To 99999
      PutBinaryValue( newArray, arrayPtr, CHAR, @Upper.Case : @Fm )
      arrayPtr += stringLength
   Next

   endTime   = TimeGetTime()
   totalTime = endTime - startTime

   Call Msg(@Window, "Total time was " : totalTime)

Return

This option took 95 milliseconds in UTF8 mode in our testing, pretty much on a par with the [] operator in ANSI mode. (As an aside, the [] operator in UTF8 mode took... well, we don't know actually: we gave up after 10 minutes of waiting for it to finish!)

We also tested the concatenation (:=) option in UTF8 mode. This slowed the program down by half: better than the [] operators, but still not great.
