Extract PDF from OLE (OpenInsight 32-Bit)
At 05 JAN 2006 11:40:58PM Dan Reese wrote:
I am working with an existing MS Access database that stores PDF documents in a column where the data type is OLE Object.
I want to extract this OLE object, then write it to the disk as a PDF file. If I simply write the contents of the field as-is, Adobe Reader says the file is damaged and cannot be repaired.
Does anyone know how to extract the PDF from the OLE object?
At 06 JAN 2006 05:14PM Dan Reese wrote:
After more testing, the pdf file seems to be coming out OK, but it is getting truncated at about 16K.
At 06 JAN 2006 10:43PM Bob Carten wrote:
Dan
I think you need to deal with the BLOBs in chunks.
see the microsoft article ACC: Reading, Storing, & Writing Binary Large Objects (BLOBs)
ADO provides commands to do the same thing.
See
Images in a BLOB Database Field
In Oi 7.12 you can create and use ADO recordsets using the OLEInstance, OleMethod, OleGetProperty and OlePutProperty
see
ADO Recordets via OLE - Update Example
HTH
Bob
At 07 JAN 2006 09:23PM Dan Reese wrote:
Hi Bob,
I think you are right about the need to process this is chunks. I have tried to follow your example, but due to my lack of understanding of ADO, I can't make this work. I can open the connection, create the record set, and interate through the correct number of rows in the table. However, I can't extract anything about the rows. Fields is empty, GetRec returns nothing, etc.
I have included some stripped down code that I put behing the click event in a window. Can you (or anyone) spot what I am doing wrong?
declare subroutine OlePutProperty
declare function OleGetProperty
$Insert Microsoft_Ado_Equates
equ CRLF$ to char(13):char(10)
ok=1
* Create ADO/ODBC connection oper =ADODB.Connection" conn =OleCreateInstance(oper) provider=Microsoft.Jet.OLEDB.4.0" OlePutProperty(conn,"Provider",provider) mdb=C:/MHS/mhs.mdb" x =OleCallMethod(conn, 'Open', mdb) * Create RecordSet command
oper=ADODB.Command"
cmd =OleCreateInstance(oper)
* Create ADO/SQL RecordSet using the command sql=select ReportPDF from MHS_JOBS order by ID" OlePutProperty(cmd, 'ActiveConnection', conn ) OlePutProperty(cmd, 'CommandType', adCmdText ) OlePutProperty(cmd, 'CommandText', sql) oper=ADODB.Recordset" rs =OleCreateInstance(oper) oper=Open" x =OleCallMethod(rs, oper, cmd ) * Extract data from recordSet
oper=MoveFirst'
x =OleCallMethod(rs, oper)
bof =OleGetProperty(rs,'BOF')
if not(bof) then
Display information about the RecordSet loop eof =OleGetProperty(rs, 'EOF') until eof How do I extract anything in this loop? The correc number of records get processed, but Fields does not contain anything GetRec returns nothing oper=MoveNext' x =OleCallMethod(rs, oper) repeat end x=OleCallMethod(rs, 'Close') x=OleCallMethod(conn, 'Close') rs=' RETURN 0 </QUOTE> —- === At 09 JAN 2006 08:32AM Dan Reese wrote: === <QUOTE>I now have a better description of the problem… Does anyone know how to extract a large binary object from Access? I have been attempting to extract a PDF file from a field in a third-party Access database. The Access datatype is OLE Object, but this is just a BLOB. Each document is about 450K. Using the OI/ODBC XOInstance methods (see Example 1 below), each document gets truncated at 16K. This problem occurs in OI 7.0 and 7.2. I am using the Microsoft Access ODBC driver in Windows XP. Using the OleCreateInstance methods and ADO in 7.2, I encounter different problems (see Example 2 below, which describes the various problems). I am unfamiliar with ADO, so I am probably just making a beginner's mistake.Example 1:
mhsJobs=XOInstance('MHS_JOBS') if mhsJobs then query=QryInstance(mhsJobs) if query then statement=select reportpdf from mhs_jobs order by id" flag=QryMethod(query, QRY_EXECUTE$, statement ) row=" loop ok=QryMethod(query, QRY_GETROW$, row) while ok if len(row) then * problem is that len(row)=16K, not 450K pdf=oconv(row,"VB") oswrite pdf on fileSpec end repeat end XOMethod(mhsJobs, XO_DESTROY$) end
Example 2:
$Insert Microsoft_Ado_Equates conn=OleCreateInstance("ADODB.Connection") provider=Microsoft.Jet.OLEDB.4.0" OlePutProperty(conn,"Provider",provider) cmd=OleCreateInstance("ADODB.Command") sql=select ReportPDF from MHS_JOBS order by ID" OlePutProperty(cmd, 'ActiveConnection', conn ) OlePutProperty(cmd, 'CommandType', adCmdText ) OlePutProperty(cmd, 'CommandText', sql) rs=OleCreateInstance("ADODB.Recordset") x=OleCallMethod(rs, "Open", cmd ) x=OleCallMethod(rs, "MoveFirst") bof=OleGetProperty(rs,'BOF') if not(bof) then tooBig=5 rowIndex=0 loop eof=OleGetProperty(rs, 'EOF') until eof This returns 1 byte for each row, always char(13) x=OleCallMethod(rs,"GetString") This executes without error, but returns nothing, and I get errors if I try to print x x=OleGetProperty(rs,"Fields") These end in error x=OleCallMethod(rs,"GetRows") x=OleCallMethod(rs,"GetRows",1,rowIndex) This works, but only for 1 record at a time, BUT!!!!… If the record set contains multiple records, each loop concatenates the pdf in each row into a single document, and it does so for each row. So it creates 1 document for each row in the Recordset where each document size is (documentSize * nbrOfDocuments) !!!!! s=OleCallMethod(rs,"Save,"c:\junk\test":rowIndex:".pdf") This works correctly for GetString, GetRows, Fields, but not for Save x=OleCallMethod(rs, "MoveNext") rowIndex += 1 Needed for Save command, not for others… until rowIndex ge tooBig repeat end x=OleCallMethod(rs, 'Close') x=OleCallMethod(conn, 'Close') What am I am doing wrong? </QUOTE> —- === At 11 JAN 2006 06:19PM Dan Reese wrote: === <QUOTE>For those who may need to use ADO, this is what I was able to make work. Extracting the large binary object would only work using the Save command, and only if the Recordset contained only 1 row and column. Smaller pieces of data worked fine using Fields, Item and Value. Hopefully, this example will save you some time… declare subroutine Set_Property, OlePutProperty declare function OleGetProperty $Insert Microsoft_Ado_Equates elsewhere in forum, from Bob Carten $insert XO_Equates equ CRLF$ to char(13):char(10) * Create Connection connection=OleCreateInstance("ADODB.Connection") provider =Microsoft.Jet.OLEDB.4.0" OlePutProperty(connection,"Provider",provider) mdb=C:/stuff/sample.mdb" x=OleCallMethod(connection, "Open", mdb) gosub displayStatus * Create Command command=OleCreateInstance("ADODB.Command") OlePutProperty(command, "ActiveConnection", connection ) OlePutProperty(command, "CommandType", adCmdText ) commandText=select * from SAMPLE_JOBS order by ID" OlePutProperty(command, "CommandText", commandText) * Create Recordset rs=OleCreateInstance("ADODB.Recordset") OlePutProperty(rs, "CursorType", adOpenKeyset ) OlePutProperty(rs, "CursorLocation", adUseClient ) x=OleCallMethod(rs, "Open", command ) * Move to Beginning of File x=OleCallMethod(rs, "MoveFirst") bof=OleGetProperty(rs,"BOF") if not(bof) then *** Display information about the RecordSet reccount =OleGetProperty(rs, "RecordCount") loop eof =OleGetProperty(rs, 'EOF') until eof /* This works, but only for 1 column and 1 row at a time. For example, this will work if the commandText above is set like this: commandText=select ReportPDF from SAMPLE_JOBS whereSave","c:\junk\test.pdf") */ /* On the other hand, if you want to extract normal text data of reasonable size using ADO, you can do it like this: fields=OleGetProperty(rs, "Fields") field=OleGetProperty(fields, "Item", columnIndex) columnIndex is zero-based value=OleGetProperty(field, "Value") x=OleCallMethod(rs, "MoveNext") */ repeat end x=OleCallMethod(rs, "Close") x=OleCallMethod(connection, "Close") </QUOTE> —- === At 18 JAN 2006 05:13PM Burke Mamlin wrote: === <QUOTE>I've run into the same problem, connecting through ODBC to a MySQL database. I can insert large records (]64K) from OI into MySQL, but when I try to read the same record back into OI it is truncated to 16K (16,384 bytes). For example: I create an empty table in MySQL named "test" (e.g., "create table 'test' ('id' varchar(255), 'record' mediumblob)") and an ODBC database connection in OI called "TEST_DB_CONN". Then I run the script below (I've left out typical error checking). This creates an 80,000 byte record in MySQL; but when I try to read that same record back into OI, I only get 16K. I've tried looping around the QRY_GETROW$, but that doesn't help. Is this an inherit limitation of the ODBC database connection? If so, is there a workaround? Perhaps I've missed some documentation on how to get rows through an ODBC XOInstance ]16K? Any help would be greatly appreciated. Thanks. -Burke —— sample script minus error checking —– function odbc_tester(VOID) $insert XO_Equates hXO=XOInstance("TEST_DB_CONN", "ODBC", "test", "test", "", XO_NOTRANS$ + XO_GLOBALSHARE$ + XO_NODIALOG$ + XO_NORESOLVE$) hQry=QryInstance(hXO) * insert 80,000 bytes into a record sql =insert into test (id, record) " sql := "values ('test', '":str("X", 80000):"');" QryMethod(hQry, QRY_EXECUTE$, sql) * Now test table contains row (id=test",record=80,000 X's) * Try to get it back… sql=select record from test where;" QryMethod(hQry, QRY_EXECUTE$, sql) QryMethod(hQry, QRY_GETROW$, row) QryMethod(hQry, QRY_DESTROY$) XOMethod(hXO, XO_DESTROY$) * Should return 80,000 but we only get 16,384 :o( return len(row) </QUOTE> —- === At 18 JAN 2006 09:02PM DSig wrote: === <QUOTE>as a simple check on the ODBC driver, have you tried to bring the data into excel? </QUOTE> —- === At 19 JAN 2006 09:57PM Burke Mamlin wrote: === <QUOTE>I created a simple Excel file with a small table containing a value 32K bytes long (the longest Excel allows). I was able to read the row just fine using Java via the ODBC connection. I then used the same ODBC connection to the spreadsheet within OpenInsight to read the value. Lo and behold, I only get 16,384 bytes. Dan described getting data chopped at 16K when reading through an ODBC XOInstance connection to MS Access. I've seen the same behavior in MySQL and again in Excel. It looks like a limitation of the ODBC connection in OpenInsight, right? We could *really* benefit from reliable access via ODBC from OI to other databases; however, if values read over 16K are silently truncated…well, that's not particularly desirable. As I mentioned earlier, the other direction (writing data from OpenInsight into an external database via ODBC) works great – I've written 800K records without a problem. So, back to the question… QryMethod(hQry, QRY_GETROW$, Row) through an ODBC connection seems to be silently truncating values at 16K. How can I use an ODBC XOInstance to read values bigger than 16,384 bytes in length? Thanks for any help you can provide, -Burke </QUOTE> —- === At 19 JAN 2006 11:35PM Dsig (SigSolutions) wrote: === <QUOTE>though this is one for Mike and crew .. from my point of view .. yes this is a limitation that needs to be corrected. dsig </QUOTE> —- === At 19 JAN 2006 11:37PM Dsig (SigSolutions) wrote: === <QUOTE>I forgot to ask .. are you using datasets or direct through xo functions? thanks </QUOTE> —- === At 20 JAN 2006 08:15AM Mike Ruane wrote: === <QUOTE>Burke et al- I've duplicated the problem and corrected it- the code in QryMethod is as follows: <code> ColCount=0 RowCount=0 Flag =XODescribeResults(hQry, ColCount, RowCount) * mtr 1-20-06 Column =str(\00\, "16384") Column =str(\00\, "1024000") ValLen =0 * fetch next row Flag=XOFetch(hQry, Direction) gosub TranslateFlag_Sub if Flag then * retrieve each column in the row for iCol=1 to ColCount Flag=XOGetColumn(hQry, iCol, Column, len(Column), ValLen, Conversion) gosub TranslateFlag_Sub while Flag if ValLen ": @fm end else Row := Column 1,ValLen: @fm end next iCol Row -1,1=" end </code> I'll roll this out into the 7.2.1 build, but I don't know any other bad effects that it may have. Sig, I can open the beta to you if you'd like to test. Mike Ruane </QUOTE> —- === At 20 JAN 2006 11:57AM Burke Mamlin wrote: === <QUOTE>Great! Thanks Mike. So the new max will be 1MB? It'd be nice to eliminate an arbitrary max (e.g., get the size needed before allocating a buffer or, for a faster solution, pass the buffer size and return a flag if there is more data than could be buffered on the first pass to force a loop only on big values when needed). If that's not going to happen, consider moving this setting into a configuration settings somewhere, so it can be adjusted without rebuilding the codebase. But we can survive with a 1MB max for now. Any chance I could get a copy (bmamlin at regenstrief dot org) of your patched $QRYMETHOD for testing until 7.2.1 is out? Thanks again for your quick response! -Burke </QUOTE> —- === At 20 JAN 2006 01:30PM Mike Ruane wrote: === <QUOTE>Sent </QUOTE> —- === At 20 JAN 2006 07:28PM dsig _at_ sigafoos.org wrote: === <QUOTE>Would love to .. dsig </QUOTE> —- === At 20 JAN 2006 07:32PM dsig _at_ sigafoos.org wrote: === <QUOTE>forgot .. the only problem i see is that if blobs are being stored .. it is possible that bmps etc exceed 1 meg .. the question would be .. IF you can get the required size back will it slow down the process to be param driven. shouldn't should it? dsig </QUOTE> —- === At 23 JAN 2006 03:57PM Burke Mamlin wrote: === <QUOTE>Thanks for the updated $QRYMETHOD. Unfortunately, after updating SYSOBJ $QRYMETHOD and its SYSREPOS record, OI crashes whenever I call QRYMETHOD with a QRY_GETROW$ (no matter what I'm trying to read). We're using 7.1.0 at the moment. Have there been other changes to QRY_METHOD since 7.1.0? DSig - have you been able to read ]16K records through ODBC with this updated QRY_METHOD? While overcoming the 16K limit improves things, we could really use a solution that handles ]1 megabyte blobs. Could XOGetColumn return a "Flag" value indicating current column exceeds the provided buffer size? Or use another parameter to do this? Allowing QRYMETHOD to loop on XOGetColumn until the column is fully read? -Burke </QUOTE> —- === At 26 JAN 2006 10:01AM DSig (SigsSolutions) wrote: === <QUOTE>Never got the update </QUOTE> —- === At 26 JAN 2006 12:16PM Mike Ruane wrote: === <QUOTE>Looks like a 7.2.1 incompatibility- grab latest from the beta site this afternoon. Thanks </QUOTE> View this thread on the Works forum...