At 17 NOV 2005 05:08:39PM Adam Fox wrote:

Can anyone tell me an easy way of stripping html tags from a string? Are there any equivalent tools to regular expressions etc that I might use to accomplish this? I am working on a legacy application (maintenance) and I need to reliably strip out HTML tags from a string. The existing code is trying to do this but some tags slip through causing the application to crash out.

I know this can be quite easily accomplished in Perl, PHP and Javascript for example and was wondering if any of the Gurus on here might know a reliable way to do it in Basic+?

Thanks in advance,

Adam


At 17 NOV 2005 05:43PM psimonsen@srpcs.com's Paul Simonsen wrote:

Adam,

One method would be to get the length of the string and loop through it character by character, copying each character from the original HTML string to another variable. If you come across a "<" character, stop copying until you come across a ">" character. That way your new variable will not have any HTML tags.

However, if you need to trap data that is within those tags, you'll have to code around that. For example, many times you'll have data in an attribute such as value="data".

Hope this helps,

psimonsen@srpcs.com

SRP Computer Solutions, Inc.
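
The character-by-character approach Paul describes could look something like the sketch below. Strip_Tags_Sketch and its variable names are made up for illustration and are not part of OpenInsight. It assumes every "<" in the input opens a tag, so a literal "<" in the text, or a ">" inside an attribute value, will confuse it.

```
Function Strip_Tags_Sketch(InputText)
* Hypothetical helper, not a built-in OpenInsight routine.
* Copies InputText to PlainText one character at a time, skipping
* everything from a "<" up to and including the next ">".
PlainText = ""
InTag     = 0
TextLen   = Len(InputText)
For CharPos = 1 To TextLen
   Ch = InputText[CharPos, 1]
   Begin Case
      Case Ch = "<"
         InTag = 1            ;* a tag starts: stop copying
      Case (Ch = ">") And InTag
         InTag = 0            ;* the tag ends: resume copying
      Case InTag = 0
         PlainText := Ch      ;* ordinary text outside any tag
   End Case
Next CharPos
Return PlainText
```

After a Declare Function Strip_Tags_Sketch, calling it with "<b>Hello</b> world" should give back "Hello world". The text inside script and style blocks is still copied, because only the tags themselves are skipped.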


At 17 NOV 2005 07:50PM dsig _at_ sigafoos.org wrote:

Adam ..

Now I am just blowing this out .. but

you might be able to find a script regex which works with the Microsoft scripting engine. I believe that Bob Carten mentioned that OI can use the scripting engine ..

Just a (partially formed) thought

dsig


At 18 NOV 2005 02:55AM Adam Fox wrote:

Hmmm, now that sounds interesting. I was wondering if there would be any way to use regular expressions. Thanks for the suggestion re parsing through the string, but the original solution does this and it has proven to let stuff slip through the gap; so yes, there is a fair amount of coding around that. Trouble is, being new to OI, I'm not aware of anything that could make the job easier.

Nothing like pattern matching or regular expressions native to OI then?

Adam


At 18 NOV 2005 02:53PM Bob Carten wrote:

No regular expressions, but there are very powerful string operations.

With square brackets, IndexC and Swap you can do well.

A nice way to use vbscript or javascript from OI 7.2 is to use the windows scripting component. There is a nice article here

you can create a component as a plain text file, say myComponent.wsc

use regsvr32 to register it then in OI you can use something like

myObj=OleCreateInstance("MyComponent.wsc")

OlePutProperty(MyObj, 'MyString', mystring)

OlePutProperty(MyObj, 'MyPattern', myPattern)

match=OleCallMethod(MyObj, 'Match')

In earlier versions you can put the WSC object in an OLE control on a window and use Set_Property, Send_Message and Get_Property to drive it.

HTH

Bob
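
As a rough illustration of the native string operations Bob mentions, the sketch below uses IndexC and the square bracket operator to cut whole blocks such as script or style out of a string without regular expressions. Remove_Block_Sketch and its variable names are hypothetical, and the opening tag is matched loosely by prefix, so treat it as a starting point rather than a finished routine.

```
Function Remove_Block_Sketch(InputText, TagName)
* Hypothetical helper, not a built-in OpenInsight routine.
* Removes every <TagName ...> ... </TagName> block, matching the
* tag name case-insensitively with IndexC.
Work     = InputText
OpenTag  = "<" : TagName
CloseTag = "</" : TagName : ">"
Loop
   StartPos = IndexC(Work, OpenTag, 1)
   EndPos   = 0
   If StartPos Then
      Rest   = Work[StartPos, Len(Work)]
      EndPos = IndexC(Rest, CloseTag, 1)   ;* position relative to StartPos
   End
While StartPos And EndPos
   NextPos = StartPos + EndPos + Len(CloseTag) - 1   ;* first character after the block
   If StartPos > 1 Then Head = Work[1, StartPos - 1] Else Head = ""
   If NextPos > Len(Work) Then Tail = "" Else Tail = Work[NextPos, Len(Work)]
   Work = Head : Tail
Repeat
Return Work
```

Called with TagName set to "script", this should drop each script block together with its contents. The loop stops when no further opening tag is found, or when an opening tag has no matching close, in which case the remainder is left untouched.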


At 23 NOV 2005 03:16PM Mark Glicksman wrote:

I've written a function to strip HTML - seems to work pretty well. You might be able to adapt it:

Function Strip_HTML_Tags(InputText)

*Revised 1/31/05

*

*Function strips the html tags, scripts, etc. out of InputText, and returns just plain text.

*

Declare Function Count_MultiVal, UCase

*

ReturnText="" ;*initialize return value

Hold=InputText

swap " " with " " in Hold

swap " " with " " in Hold

swap "

" with " " in Hold

swap "

" with " " in Hold

swap "" with " " in Hold

swap "" with " " in Hold

Swap @tm with "" in Hold

Hold2=Hold

Swap "

Swap ]" with @fm:" " in TextPiece
*Ignore scripts and style statements
Begin Case
Case UCase(TextPiece[1,5])="TITLE"
  ReturnText=""
Case UCase(TextPiece[1,5])="STYLE"
  ReturnText=""
Case UCase(TextPiece[1,6])="SCRIPT"
  ReturnText=""
Case 1
  ReturnText=TextPiece
End Case

Next I

Begin Case

Case Count_MultiVal(Hold)=1

ReturnText=Hold2

Case 1

Swap @vm with "" in ReturnText

End Case

*

Gosub Swap_Special_Chars

*

*

EndSub:

Return Trim(ReturnText)

*

*

*

Swap_Special_Chars:

*Swap the codes for special characters with the actual characters

Swap "&nbsp " with char(160):" " in ReturnText

Swap "&iexcl " with char(161):" " in ReturnText

Swap "&cent " with char(162):" " in ReturnText

Swap "&pound " with char(163):" " in ReturnText

Swap "&curren " with char(164):" " in ReturnText

Swap "&yen " with char(165):" " in ReturnText

Swap "&brvbar " with char(166):" " in ReturnText

Swap "&sect " with char(167):" " in ReturnText

Swap "&uml " with char(168):" " in ReturnText

Swap "&copy " with char(169):" " in ReturnText

Swap "&ordf " with char(170):" " in ReturnText

Swap "&laquo " with char(171):" " in ReturnText

Swap "&not " with char(172):" " in ReturnText

Swap "&shy " with char(173):" " in ReturnText

Swap "&reg " with char(174):" " in ReturnText

Swap "&macr " with char(175):" " in ReturnText

Swap "&deg " with char(176):" " in ReturnText

Swap "&plusmn " with char(177):" " in ReturnText

Swap "&sup2 " with char(178):" " in ReturnText

Swap "&sup3 " with char(179):" " in ReturnText

Swap "&acute " with char(180):" " in ReturnText

Swap "&micro " with char(181):" " in ReturnText

Swap "&para " with char(182):" " in ReturnText

Swap "&middot " with char(183):" " in ReturnText

Swap "&cedil " with char(184):" " in ReturnText

Swap "&sup1 " with char(185):" " in ReturnText

Swap "&ordm " with char(186):" " in ReturnText

Swap "&raquo " with char(187):" " in ReturnText

Swap "&frac14 " with char(188):" " in ReturnText

Swap "&frac12 " with char(189):" " in ReturnText

Swap "&frac34 " with char(190):" " in ReturnText

Swap "&iquest " with char(191):" " in ReturnText

Swap "&Agrave " with char(192):" " in ReturnText

Swap "&Aacute " with char(193):" " in ReturnText

Swap "&Acirc " with char(194):" " in ReturnText

Swap "&Atilde " with char(195):" " in ReturnText

Swap "&Auml " with char(196):" " in ReturnText

Swap "&Aring " with char(197):" " in ReturnText

Swap "&AElig " with char(198):" " in ReturnText

Swap "&Ccedil " with char(199):" " in ReturnText

Swap "&Egrave " with char(200):" " in ReturnText

Swap "&Eacute " with char(201):" " in ReturnText

Swap "&Ecirc " with char(202):" " in ReturnText

Swap "&Euml " with char(203):" " in ReturnText

Swap "&Igrave " with char(204):" " in ReturnText

Swap "&Iacute " with char(205):" " in ReturnText

Swap "&Icirc " with char(206):" " in ReturnText

Swap "&Iuml " with char(207):" " in ReturnText

Swap "&ETH " with char(208):" " in ReturnText

Swap "&Ntilde " with char(209):" " in ReturnText

Swap "&Ograve " with char(210):" " in ReturnText

Swap "&Oacute " with char(211):" " in ReturnText

Swap "&Ocirc " with char(212):" " in ReturnText

Swap "&Otilde " with char(213):" " in ReturnText

Swap "&Ouml " with char(214):" " in ReturnText

Swap "&times " with char(215):" " in ReturnText

Swap "&Oslash " with char(216):" " in ReturnText

Swap "&Ugrave " with char(217):" " in ReturnText

Swap "&Uacute " with char(218):" " in ReturnText

Swap "&Ucirc " with char(219):" " in ReturnText

Swap "&Uuml " with char(220):" " in ReturnText

Swap "&Yacute " with char(221):" " in ReturnText

Swap "&THORN " with char(222):" " in ReturnText

Swap "&szlig " with char(223):" " in ReturnText

Swap "&agrave " with char(224):" " in ReturnText

Swap "&aacute " with char(225):" " in ReturnText

Swap "&acirc " with char(226):" " in ReturnText

Swap "&atilde " with char(227):" " in ReturnText

Swap "&auml " with char(228):" " in ReturnText

Swap "&aring " with char(229):" " in ReturnText

Swap "&aelig " with char(230):" " in ReturnText

Swap "&ccedil " with char(231):" " in ReturnText

Swap "&egrave " with char(232):" " in ReturnText

Swap "&eacute " with char(233):" " in ReturnText

Swap "&ecirc " with char(234):" " in ReturnText

Swap "&euml " with char(235):" " in ReturnText

Swap "&igrave " with char(236):" " in ReturnText

Swap "&iacute " with char(237):" " in ReturnText

Swap "&icirc " with char(238):" " in ReturnText

Swap "&iuml " with char(239):" " in ReturnText

Swap "&eth " with char(240):" " in ReturnText

Swap "&ntilde " with char(241):" " in ReturnText

Swap "&ograve " with char(242):" " in ReturnText

Swap "&oacute " with char(243):" " in ReturnText

Swap "&ocirc " with char(244):" " in ReturnText

Swap "&otilde " with char(245):" " in ReturnText

Swap "&ouml " with char(246):" " in ReturnText

Swap "&divide " with char(247):" " in ReturnText

Swap "&oslash " with char(248):" " in ReturnText

Swap "&ugrave " with char(249):" " in ReturnText

Swap "&uacute " with char(250):" " in ReturnText

Swap "&ucirc " with char(251):" " in ReturnText

Swap "&uuml " with char(252):" " in ReturnText

Swap "&yacute " with char(253):" " in ReturnText

Swap "&thorn " with char(254):" " in ReturnText

Swap "&yuml " with char(255):" " in ReturnText

*

Swap "&nbsp;" with char(160) in ReturnText

Swap "&iexcl;" with char(161) in ReturnText

Swap "&cent;" with char(162) in ReturnText

Swap "&pound;" with char(163) in ReturnText

Swap "&curren;" with char(164) in ReturnText

Swap "&yen;" with char(165) in ReturnText

Swap "&brvbar;" with char(166) in ReturnText

Swap "&sect;" with char(167) in ReturnText

Swap "&uml;" with char(168) in ReturnText

Swap "&copy;" with char(169) in ReturnText

Swap "&ordf;" with char(170) in ReturnText

Swap "&laquo;" with char(171) in ReturnText

Swap "&not;" with char(172) in ReturnText

Swap "&shy;" with char(173) in ReturnText

Swap "&reg;" with char(174) in ReturnText

Swap "&macr;" with char(175) in ReturnText

Swap "&deg;" with char(176) in ReturnText

Swap "&plusmn;" with char(177) in ReturnText

Swap "&sup2;" with char(178) in ReturnText

Swap "&sup3;" with char(179) in ReturnText

Swap "&acute;" with char(180) in ReturnText

Swap "&micro;" with char(181) in ReturnText

Swap "&para;" with char(182) in ReturnText

Swap "&middot;" with char(183) in ReturnText

Swap "&cedil;" with char(184) in ReturnText

Swap "&sup1;" with char(185) in ReturnText

Swap "&ordm;" with char(186) in ReturnText

Swap "&raquo;" with char(187) in ReturnText

Swap "&frac14;" with char(188) in ReturnText

Swap "&frac12;" with char(189) in ReturnText

Swap "&frac34;" with char(190) in ReturnText

Swap "&iquest;" with char(191) in ReturnText

Swap "&Agrave;" with char(192) in ReturnText

Swap "&Aacute;" with char(193) in ReturnText

Swap "&Acirc;" with char(194) in ReturnText

Swap "&Atilde;" with char(195) in ReturnText

Swap "&Auml;" with char(196) in ReturnText

Swap "&Aring;" with char(197) in ReturnText

Swap "&AElig;" with char(198) in ReturnText

Swap "&Ccedil;" with char(199) in ReturnText

Swap "&Egrave;" with char(200) in ReturnText

Swap "&Eacute;" with char(201) in ReturnText

Swap "&Ecirc;" with char(202) in ReturnText

Swap "&Euml;" with char(203) in ReturnText

Swap "&Igrave;" with char(204) in ReturnText

Swap "&Iacute;" with char(205) in ReturnText

Swap "&Icirc;" with char(206) in ReturnText

Swap "&Iuml;" with char(207) in ReturnText

Swap "&ETH;" with char(208) in ReturnText

Swap "&Ntilde;" with char(209) in ReturnText

Swap "&Ograve;" with char(210) in ReturnText

Swap "&Oacute;" with char(211) in ReturnText

Swap "&Ocirc;" with char(212) in ReturnText

Swap "&Otilde;" with char(213) in ReturnText

Swap "&Ouml;" with char(214) in ReturnText

Swap "&times;" with char(215) in ReturnText

Swap "&Oslash;" with char(216) in ReturnText

Swap "&Ugrave;" with char(217) in ReturnText

Swap "&Uacute;" with char(218) in ReturnText

Swap "&Ucirc;" with char(219) in ReturnText

Swap "&Uuml;" with char(220) in ReturnText

Swap "&Yacute;" with char(221) in ReturnText

Swap "&THORN;" with char(222) in ReturnText

Swap "&szlig;" with char(223) in ReturnText

Swap "&agrave;" with char(224) in ReturnText

Swap "&aacute;" with char(225) in ReturnText

Swap "&acirc;" with char(226) in ReturnText

Swap "&atilde;" with char(227) in ReturnText

Swap "&auml;" with char(228) in ReturnText

Swap "&aring;" with char(229) in ReturnText

Swap "&aelig;" with char(230) in ReturnText

Swap "&ccedil;" with char(231) in ReturnText

Swap "&egrave;" with char(232) in ReturnText

Swap "&eacute;" with char(233) in ReturnText

Swap "&ecirc;" with char(234) in ReturnText

Swap "&euml;" with char(235) in ReturnText

Swap "&igrave;" with char(236) in ReturnText

Swap "&iacute;" with char(237) in ReturnText

Swap "&icirc;" with char(238) in ReturnText

Swap "&iuml;" with char(239) in ReturnText

Swap "&eth;" with char(240) in ReturnText

Swap "&ntilde;" with char(241) in ReturnText

Swap "&ograve;" with char(242) in ReturnText

Swap "&oacute;" with char(243) in ReturnText

Swap "&ocirc;" with char(244) in ReturnText

Swap "&otilde;" with char(245) in ReturnText

Swap "&ouml;" with char(246) in ReturnText

Swap "&divide;" with char(247) in ReturnText

Swap "&oslash;" with char(248) in ReturnText

Swap "&ugrave;" with char(249) in ReturnText

Swap "&uacute;" with char(250) in ReturnText

Swap "&ucirc;" with char(251) in ReturnText

Swap "&uuml;" with char(252) in ReturnText

Swap "&yacute;" with char(253) in ReturnText

Swap "&thorn;" with char(254) in ReturnText

Swap "&yuml;" with char(255) in ReturnText

*

For I=160 to 255

Swap "&#" : I : " " with Char(I) : " " in ReturnText
Swap "&#" : I : ";" with Char(I) in ReturnText

Next I

Return

