url encoding of ampersand (OpenInsight Specific)
At 20 APR 2001 09:50:55AM Dave Harmacek dave@harmacek.com wrote:
In Web Deployment, how have you solved the problem of user entry containing the ampersand character, &? The entry is truncated, and Inet_QueryParam doesn't parse it correctly either.
e.g. User enters (without the quotes) "The quick & dirty fox jumped over the brown dog" OI Inet_QueryParam returns "The quick ".
I know about URL encoding of & to %26. I expected the browser's Submit to encode it for me.
tia, Dave
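For reference, a minimal sketch of what percent-encoding does to the example text. It uses the standard JavaScript encodeURIComponent function rather than anything OpenInsight-specific; encodeValue is just a hypothetical wrapper name for this illustration.

```javascript
// Percent-encode a form value so that "&" cannot be mistaken for a
// parameter separator. 0x26 (decimal 38) is the character code of "&".
// encodeValue is a hypothetical helper name, not part of any OI API.
function encodeValue(text) {
  return encodeURIComponent(text);
}

const sample = "The quick & dirty fox";
console.log(encodeValue(sample)); // The%20quick%20%26%20dirty%20fox
```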
At 20 APR 2001 10:43AM Oystein Reigem wrote:
Dave,
I've been through this at one time, but don't remember much. But revisiting the app in question it seems I did the following:
(1) In the web pages containing forms use the JavaScript escape() function to encode all difficult characters before they are submitted to the server.
(2) In the Inet procedures preparing the html page to be returned to the client run all text (not codes) through the opposite conversion.
More to follow…
- Oystein -
At 20 APR 2001 11:33AM Oystein Reigem wrote:
Dave,
(2)
Here's some code to use at startup that builds a conversion table (array) from ANSI text to html.
I had the array in common.
Seems I've included the OI delimiters. Don't remember why.
Simple version:
HTML_Conv$=""
for I=0 to 255
   HTML_Conv$<I + 1>=char(I)
next I
HTML_Conv$<seq('"') + 1>="&quot;"
HTML_Conv$<seq("<") + 1>="&lt;"
HTML_Conv$<seq(">") + 1>="&gt;"
HTML_Conv$<seq("&") + 1>="&amp;"
Advanced version - the above plus the following:
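/* 7 bit chars */
HTML_Conv_Pro$=""
for I=0 to 127
   HTML_Conv_Pro$<I + 1>=HTML_Conv$<I + 1>
next I
/* 8 bit chars coded as numbers */
for I=128 to 255
   HTML_Conv_Pro$<I + 1>="&#" : I : ";"
next I
/* more readable entities for many of the 8 bit chars */
HTML_Conv_Pro$<seq("æ") + 1>="&aelig;"
HTML_Conv_Pro$<seq("ø") + 1>="&oslash;"
HTML_Conv_Pro$<seq("å") + 1>="&aring;"
HTML_Conv_Pro$<seq("Æ") + 1>="&AElig;"
HTML_Conv_Pro$<seq("Ø") + 1>="&Oslash;"
HTML_Conv_Pro$<seq("Å") + 1>="&Aring;"
HTML_Conv_Pro$<seq("à") + 1>="&agrave;"
HTML_Conv_Pro$<seq("á") + 1>="&aacute;"
HTML_Conv_Pro$<seq("â") + 1>="&acirc;"
HTML_Conv_Pro$<seq("ã") + 1>="&atilde;"
HTML_Conv_Pro$<seq("ä") + 1>="&auml;"
HTML_Conv_Pro$<seq("ç") + 1>="&ccedil;"
HTML_Conv_Pro$<seq("è") + 1>="&egrave;"
HTML_Conv_Pro$<seq("é") + 1>="&eacute;"
HTML_Conv_Pro$<seq("ê") + 1>="&ecirc;"
HTML_Conv_Pro$<seq("ë") + 1>="&euml;"
HTML_Conv_Pro$<seq("ì") + 1>="&igrave;"
HTML_Conv_Pro$<seq("í") + 1>="&iacute;"
HTML_Conv_Pro$<seq("î") + 1>="&icirc;"
HTML_Conv_Pro$<seq("ï") + 1>="&iuml;"
HTML_Conv_Pro$<seq("ð") + 1>="&eth;"
HTML_Conv_Pro$<seq("ñ") + 1>="&ntilde;"
HTML_Conv_Pro$<seq("ò") + 1>="&ograve;"
HTML_Conv_Pro$<seq("ó") + 1>="&oacute;"
HTML_Conv_Pro$<seq("ô") + 1>="&ocirc;"
HTML_Conv_Pro$<seq("õ") + 1>="&otilde;"
HTML_Conv_Pro$<seq("ö") + 1>="&ouml;"
HTML_Conv_Pro$<seq("ù") + 1>="&ugrave;"
HTML_Conv_Pro$<seq("À") + 1>="&Agrave;"
HTML_Conv_Pro$<seq("Á") + 1>="&Aacute;"
HTML_Conv_Pro$<seq("Â") + 1>="&Acirc;"
HTML_Conv_Pro$<seq("Ã") + 1>="&Atilde;"
HTML_Conv_Pro$<seq("Ä") + 1>="&Auml;"
HTML_Conv_Pro$<seq("Ç") + 1>="&Ccedil;"
HTML_Conv_Pro$<seq("È") + 1>="&Egrave;"
HTML_Conv_Pro$<seq("É") + 1>="&Eacute;"
HTML_Conv_Pro$<seq("Ê") + 1>="&Ecirc;"
HTML_Conv_Pro$<seq("Ë") + 1>="&Euml;"
HTML_Conv_Pro$<seq("Ì") + 1>="&Igrave;"
HTML_Conv_Pro$<seq("Í") + 1>="&Iacute;"
HTML_Conv_Pro$<seq("Î") + 1>="&Icirc;"
HTML_Conv_Pro$<seq("Ï") + 1>="&Iuml;"
HTML_Conv_Pro$<seq("Ð") + 1>="&ETH;"
HTML_Conv_Pro$<seq("Ñ") + 1>="&Ntilde;"
HTML_Conv_Pro$<seq("Ò") + 1>="&Ograve;"
HTML_Conv_Pro$<seq("Ó") + 1>="&Oacute;"
HTML_Conv_Pro$<seq("Ô") + 1>="&Ocirc;"
HTML_Conv_Pro$<seq("Õ") + 1>="&Otilde;"
HTML_Conv_Pro$<seq("Ö") + 1>="&Ouml;"
HTML_Conv_Pro$<seq("Ù") + 1>="&Ugrave;"
HTML_Conv_Pro$<seq("ú") + 1>="&uacute;"
HTML_Conv_Pro$<seq("û") + 1>="&ucirc;"
HTML_Conv_Pro$<seq("ü") + 1>="&uuml;"
HTML_Conv_Pro$<seq("ý") + 1>="&yacute;"
HTML_Conv_Pro$<seq("þ") + 1>="&thorn;"
HTML_Conv_Pro$<seq("ÿ") + 1>="&yuml;"
HTML_Conv_Pro$<seq("Ú") + 1>="&Uacute;"
HTML_Conv_Pro$<seq("Û") + 1>="&Ucirc;"
HTML_Conv_Pro$<seq("Ü") + 1>="&Uuml;"
HTML_Conv_Pro$<seq("Ý") + 1>="&Yacute;"
HTML_Conv_Pro$<seq("Þ") + 1>="&THORN;"
HTML_Conv_Pro$<seq("Ÿ") + 1>="&Yuml;"
HTML_Conv_Pro$<seq("ß") + 1>="&szlig;"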
Now conversion of a character (call it Old_Char) can be done with
New_Char=HTML_Conv$<seq(Old_Char) + 1>
or
New_Char=HTML_Conv_Pro$<seq(Old_Char) + 1>
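The same table-lookup idea can be sketched in JavaScript for anyone who wants to try it outside OI. This is only an illustration of the simple version: buildHtmlTable and toHtml are hypothetical names, and the table is indexed from 0 rather than from 1 as in the Basic+ code.

```javascript
// Build a 256-entry table mapping each character code to its HTML form.
// buildHtmlTable and toHtml are hypothetical names for this sketch.
function buildHtmlTable() {
  const table = [];
  for (let code = 0; code <= 255; code++) {
    table[code] = String.fromCharCode(code); // default: character maps to itself
  }
  // The four characters that have a special meaning in HTML
  table['"'.charCodeAt(0)] = "&quot;";
  table["<".charCodeAt(0)] = "&lt;";
  table[">".charCodeAt(0)] = "&gt;";
  table["&".charCodeAt(0)] = "&amp;";
  return table;
}

// Convert a whole string, one character at a time, via the table.
// Characters above code 255 are outside the table and pass through unchanged.
function toHtml(text, table) {
  let out = "";
  for (const ch of text) {
    const code = ch.charCodeAt(0);
    out += code <= 255 ? table[code] : ch;
  }
  return out;
}

const HTML_TABLE = buildHtmlTable();
console.log(toHtml('Smith & Jones <"quick">', HTML_TABLE));
// Smith &amp; Jones &lt;&quot;quick&quot;&gt;
```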
- Oystein -
At 20 APR 2001 11:51AM Oystein Reigem wrote:
[notag]Dave,
(1)
What I had was an image database with a query form.
That form had several searchable fields.
Each field had its own submit button. It was only possible to search one field at a time.
In addition I had a hidden field where I put the converted value.
I also had a hidden field where I put the name of the field to search.
Both these parameters had to be returned to my OI app.
Each field was defined like this:
<INPUT TYPE=TEXT NAME=MOTIV VALUE="">
(This was the "motiv" field, i.e., a field for a subject/content description of an image.)
The field for the converted value was defined like this:
<INPUT TYPE=HIDDEN NAME="$INPUTVALUE" VALUE="">
The field for the name of the field to search was defined like this:
<INPUT TYPE=HIDDEN NAME="$FIELDNAME" VALUE="">
For each submit button I used an image:
<IMG NAME="MOTIVBUTTON" BORDER=0 SRC="/xxx/xxx/xxx.jpg" ALT="xxx xxx xxx">
To make it work as a link I had to surround it with an <A> code pair:
<A HREF="javascript:query('MOTIV',document.Form0.MOTIV.value)">
<IMG NAME="MOTIVBUTTON" BORDER=0 SRC="/xxx/xxx/Motiv.jpg" ALT="xxx xxx xxx">
</A>
As you can see what happens when the user clicks the button (image) is that a JavaScript function query() runs. It's this function that does the actual submitting. I wrote that query() function, and I've shown it below. Its first parameter is the name of the field to search. The second one is the search value to be converted. As you can see I get the value by referring to the value of the MOTIV object (field) of the first (0th) form of the current document (document.Form0.MOTIV.value).
Here is the query() function:
<SCRIPT>
function query(FNV, IV) {
  with (document.Form0) {
    $FIELDNAME.value = FNV;           // name of the field to search
    $INPUTVALUE.value = escape(IV);   // search value, escaped
    submit();
  }
}
</SCRIPT>
It really doesn't do much. It takes two values and puts them into the two hidden fields. One of the values is converted with the escape() function.
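To see what ends up in the hidden fields without a browser, here is a small DOM-free sketch. fillHiddenFields and fakeForm are hypothetical stand-ins for query() and document.Form0; only escape() is the real JavaScript function used above.

```javascript
// fillHiddenFields is a hypothetical, DOM-free stand-in for query():
// "form" is any object whose properties have a .value, as form fields do.
function fillHiddenFields(form, fieldName, inputValue) {
  form["$FIELDNAME"].value = fieldName;
  form["$INPUTVALUE"].value = escape(inputValue); // "&" becomes "%26"
  return form;
}

// A plain object playing the role of the form (assumption for illustration)
const fakeForm = {
  MOTIV: { value: "fish & chips" },
  "$FIELDNAME": { value: "" },
  "$INPUTVALUE": { value: "" },
};

fillHiddenFields(fakeForm, "MOTIV", fakeForm.MOTIV.value);
console.log(fakeForm["$FIELDNAME"].value);  // MOTIV
console.log(fakeForm["$INPUTVALUE"].value); // fish%20%26%20chips
```

Note that the visible MOTIV field still holds the raw text, ampersand included.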
Now at the server end (my OI app) I got the name of the field to search from the $FIELDNAME query parameter, and the value to search for from the $INPUTVALUE parameter.
This example was a bit more complicated than necessary, but I don't dare simplify since I'm a bit rusty on the subject.
- Oystein -[/notag]
At 25 APR 2001 08:45AM Dave Harmacek wrote:
Thanks for your response. I found that if I put the encoded result into a separate and "hidden" field, the original data field is still sent during the submit and screws up the parsing.
At 25 APR 2001 01:37PM Oystein Reigem wrote:
Dave,
The thought occurred to me after I posted. But then I forgot the whole thing.
I must have the same problem in my app then. Or could it be something with the order of fields? Perhaps if the problematic data occur late enough in the query parameters - after all the parameters you actually use - it's all right?
I'll see if I can do some experiments tomorrow - if not for you at least for my own sake.
I believe Sprezzatura had a solution. Perhaps they just ran the conversion on the original fields.
- Oystein -
At 26 APR 2001 05:52AM Oystein Reigem wrote:
Dave,
About screwing up the parsing
Let's say you have an html form with fields NAME and AGE.
If the user fills in Jones and 33 the query parameters become NAME=Jones&AGE=33, which get parsed OK.
On the other hand if the user fills in Smith&Jones and 33 the query parameters become NAME=Smith&Jones&AGE=33, which get parsed as three parameters NAME=Smith, Jones= and AGE=33. The way I handle this in my app is I always tell the user how the app interpreted her query. So my app would specifically tell the user it had done a query on NAME=Smith. Also none of my query fields contain values with special characters like &, so there's not much damage done when the parameter gets truncated. Also note that the bogus parameter Jones= does no harm in itself, and that AGE=33 is also unharmed.
A user with sufficient knowledge could of course botch up the query by filling in values like DAY&AGE in the NAME field, which would produce a bogus AGE=. Since this bogus AGE= precedes the proper AGE=33 parameter it will be used by the app instead of the proper one.
Another example of this: In the query forms for my app I have a hardcoded hidden field $TABLENAME that specifies which table to search. The $TABLENAME field is among the last fields of the form, preceded by all the query fields. So if a knowledgeable user fills in the value something&$TABLENAME=BOLLOCKS in a query field, that will produce a bogus $TABLENAME=BOLLOCKS parameter, causing my app to try and search a non-existing BOLLOCKS table.
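The parsing behaviour described above can be reproduced with a deliberately naive parser. This is a sketch only: naiveParse and firstValue are hypothetical names, and it is not a claim about how Inet_QueryParam is actually implemented.

```javascript
// naiveParse: split the query string on "&", then each part on the first "=".
// No decoding is done. Hypothetical helper for illustration.
function naiveParse(query) {
  return query.split("&").map((part) => {
    const eq = part.indexOf("=");
    return eq < 0
      ? { name: part, value: "" }
      : { name: part.slice(0, eq), value: part.slice(eq + 1) };
  });
}

// firstValue: return the value of the first parameter with the given name,
// or null if there is none. The first occurrence wins.
function firstValue(params, name) {
  const hit = params.find((p) => p.name === name);
  return hit ? hit.value : null;
}

const truncated = naiveParse("NAME=Smith&Jones&AGE=33");
console.log(firstValue(truncated, "NAME")); // Smith
console.log(firstValue(truncated, "AGE"));  // 33

const hijacked = naiveParse("NAME=DAY&AGE&AGE=33");
console.log(firstValue(hijacked, "AGE"));   // (empty string: the bogus AGE wins)
```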
One question: Can you see any case of non-local damage to the query parameters? I admit there are problems with my approach, but at least the effect of an ampersand in a field doesn't propagate to all the following parameters.
UnEscape()
For completeness - I forgot to mention that in the app I convert the query parameters back, reversing the form's escape(), with the following UnEscape function.
Each query parameter should be UnEscape'd separately, of course, so as not to reintroduce the ampersand problem.
<code>
function UnEscape( String )
   Num=len(String)
   NewString=""
   I=1
   loop
   while I <= Num
      Ch=String[I, 1]
      if Ch="%" then
         if I + 2 > Num then
            /* too few characters following "%" */
            /* abort conversion */
            I=Num + 1
         end else
            H1=String[I+1, 1]
            convert "abcdef" to "ABCDEF" in H1
            H2=String[I+2, 1]
            convert "abcdef" to "ABCDEF" in H2
            if index("0123456789ABCDEF", H1, 1) and index("0123456789ABCDEF", H2, 1) then
               NewString := Char( IConv( H1 : H2, "MX" ) )
               I += 3
            end else
               /* characters following "%" not legal hex digits */
               /* abort conversion */
               I=Num + 1
            end
         end
      end else
         NewString := Ch
         I += 1
      end
   repeat
return NewString
</code>
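For checking the round trip against escape() outside OI, here is a JavaScript sketch of the same logic. It is a port made for illustration (unEscape is a hypothetical name), and it keeps the behaviour of stopping the conversion at a malformed "%" sequence.

```javascript
// JavaScript sketch of the UnEscape logic: decode %XX sequences and stop
// converting at the first malformed one, keeping what was decoded so far.
function unEscape(text) {
  const HEX = "0123456789ABCDEF";
  let out = "";
  let i = 0;
  while (i < text.length) {
    const ch = text[i];
    if (ch !== "%") {
      out += ch;
      i += 1;
      continue;
    }
    // Too few characters following "%": abort conversion
    if (i + 2 >= text.length) break;
    const h1 = text[i + 1].toUpperCase();
    const h2 = text[i + 2].toUpperCase();
    // Characters following "%" are not legal hex digits: abort conversion
    if (!HEX.includes(h1) || !HEX.includes(h2)) break;
    out += String.fromCharCode(parseInt(h1 + h2, 16));
    i += 3;
  }
  return out;
}

console.log(unEscape(escape("Smith & Jones"))); // Smith & Jones
```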
- Oystein -
At 26 APR 2001 06:35AM Oystein Reigem wrote:
Dave,
One more thing. There might be an error in the OI Inet_QueryParam function. At least there's an error in OI 3.61's version of this function. It doesn't handle encoded data properly. If a query parameter contains a properly encoded ampersand (%26), it is treated as a real ampersand, corrupting the query parameters. So the encoding you do in the form is of no help at all. Bummer!
The solution (if it isn't solved in newer versions of OI) is to write one's own Inet_QueryParam function.
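One plausible cause of such behaviour is decoding the whole query string before splitting it on "&". That is an assumption on my part, not something verified against the OI source. The sketch below contrasts the two orders; all function names are hypothetical, and decodeURIComponent will throw on a malformed "%" sequence, which a real replacement would need to handle.

```javascript
// Split a query string into [name, value] pairs on "&" and the first "=".
function splitPairs(query) {
  return query.split("&").map((part) => {
    const eq = part.indexOf("=");
    return eq < 0 ? [part, ""] : [part.slice(0, eq), part.slice(eq + 1)];
  });
}

// Assumed faulty order: decode everything first, then split.
// An encoded %26 turns into a real "&" and breaks the value apart.
function decodeThenSplit(query) {
  return splitPairs(decodeURIComponent(query));
}

// Safer order: split first, then decode each name and value separately.
function splitThenDecode(query) {
  return splitPairs(query).map(([name, value]) => [
    decodeURIComponent(name),
    decodeURIComponent(value),
  ]);
}

const q = "NAME=Smith%26Jones&AGE=33";
console.log(JSON.stringify(decodeThenSplit(q)));
// [["NAME","Smith"],["Jones",""],["AGE","33"]]
console.log(JSON.stringify(splitThenDecode(q)));
// [["NAME","Smith&Jones"],["AGE","33"]]
```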
I do remember Sprezzatura posting something about this at the time I did my development but it never registered.
A kludgier and not watertight solution would be to use one's own non-standard encoding.
I think I should thank you for stirring this up again. As I said earlier I don't think the problem matters much to my users but it's good to get things right.
- Oystein -
At 26 APR 2001 12:02PM [url=http://www.sprezzatura.com]The Sprezzatura Group[/url] wrote:
For obvious reasons it has been a year or so since we experienced this problem with OICGI but when we used OICGI we got around the problem by encoding the ampersand to our own string using javaScript at the front end and decoding it in R/Basic at the back end as you have indicated.
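A minimal sketch of that kind of private encoding follows. The marker string "~amp~" is an assumption chosen for illustration; the thread does not say which string was actually used. The decoding side is shown in JavaScript only to demonstrate the round trip; in the setup described it would be done in R/Basic.

```javascript
// AMP_TOKEN is a hypothetical marker; any string unlikely to occur in
// user data would do. The scheme fails if a user types the marker itself.
const AMP_TOKEN = "~amp~";

// Front end: swap every ampersand for the marker before submitting.
function swapAmpersands(text) {
  return text.split("&").join(AMP_TOKEN);
}

// Back end: swap the marker back (shown here in JavaScript for symmetry).
function restoreAmpersands(text) {
  return text.split(AMP_TOKEN).join("&");
}

const sent = swapAmpersands("Smith & Jones");
console.log(sent);                    // Smith ~amp~ Jones
console.log(restoreAmpersands(sent)); // Smith & Jones
```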
World Leaders in all things RevSoft
At 27 APR 2001 04:23AM Oystein Reigem wrote:
Sprezzatura,
…when we used OICGI we got around the problem by encoding the ampersand to our own string using javaScript at the front end and decoding it in R/Basic at the back end as you have indicated.
Was it like this?: The user clicks the submit button, which runs a JavaScript function. That function loops through all fields and encodes their values (i.e., replaces all ampersands with a certain string) and then does a submit().
If so: What happens if the user later goes back in history to that page? What will she see? The original values (ampersands) or the encoded values? I believe the latter. Dodgy.
- Oystein -
At 27 APR 2001 08:10AM [url=http://www.sprezzatura.com]The Sprezzatura Group[/url] wrote:
Oystein,
Yes, very dodgy. One workaround would be creating a hidden form and input tags and populating and submitting those perhaps? Or if the data is not large just set the document's location property with a URL you construct yourself containing the swapped out string (i.e. do a GET instead of a POST).
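The second suggestion might look roughly like this. It is a sketch under assumptions: buildQueryUrl, the "~amp~" marker and the base URL are all hypothetical, and values containing "=", "+" or "#" are not handled. In a browser the result would be assigned to document.location; here it is only printed, so the visible form fields are never altered.

```javascript
// AMP_MARK is a hypothetical marker string standing in for "&" in values.
const AMP_MARK = "~amp~";

// Build a GET URL from field names and raw values, swapping ampersands in
// the values for the marker. encodeURI then escapes spaces and similar
// characters but leaves the "?", "=" and "&" separators alone.
function buildQueryUrl(baseUrl, fields) {
  const pairs = Object.keys(fields).map(
    (name) => name + "=" + String(fields[name]).split("&").join(AMP_MARK)
  );
  return encodeURI(baseUrl + "?" + pairs.join("&"));
}

const url = buildQueryUrl("/cgi-bin/example", { NAME: "Smith & Jones", AGE: "33" });
console.log(url); // /cgi-bin/example?NAME=Smith%20~amp~%20Jones&AGE=33
```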
World leaders in all things RevSoft
At 27 APR 2001 10:23AM Oystein Reigem wrote:
Sprezzatura,
Well, here's my latest dodgy idea:
Let's say your form has the fields NAME and NUMBER. A user might e.g. fill in these fields with values Smith & Jones and 33.
Now let the form have one additional first field - a hidden one. Call it ALL_ENCODED or something. Just before submitting, loop through all fields except ALL_ENCODED. For each field encode its value. Concatenate field names and encoded values into a proper query parameters string, with an extra ampersand at the start. Put that string into ALL_ENCODED.
With the user data above this string might become
&NAME=Smith%20%26%20Jones&NUMBER=33
(depending on how heavily one does one's encoding), yielding the following query parameters:
ALL_ENCODED=&NAME=Smith%20%26%20Jones&NUMBER=33&NAME=Smith & Jones&NUMBER=33
Result:
- One empty, disposable ALL_ENCODED query parameter
- One properly encoded NAME parameter that can safely be extracted with the Inet_QueryParam function, and then decoded
- One ditto NUMBER parameter
- One unencoded NAME parameter that will be ignored by Inet_QueryParam because the latter found the first version of the parameter
- One ditto NUMBER parameter.
This method will double the size of the query parameters, but I don't see why it shouldn't work.
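Building the ALL_ENCODED string could be sketched as follows. buildAllEncoded is a hypothetical name, and the fields are passed as a plain object rather than read from a real form, so this shows only the string construction, not the submit.

```javascript
// Concatenate "&name=escapedValue" for every field, giving a string that
// starts with an extra ampersand as described above.
// buildAllEncoded is a hypothetical helper for illustration.
function buildAllEncoded(fields) {
  let out = "";
  for (const name of Object.keys(fields)) {
    out += "&" + name + "=" + escape(fields[name]);
  }
  return out;
}

const allEncoded = buildAllEncoded({ NAME: "Smith & Jones", NUMBER: "33" });
console.log(allEncoded); // &NAME=Smith%20%26%20Jones&NUMBER=33
```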
- Oystein -