Client Server Corner - SQL Server Data Loading Part II - Bill Wolff
Published By | Date | Version | Knowledge Level | Keywords |
---|---|---|---|---|
Sprezzatura Ltd | 01 JUN 1992 | 2.12+ | EXPERT | SQL, SERVER, MSSQL_C, MSSQL_C_DAT, BCP |
MSSQL_C Really Does the Work
The same principles can be applied at an even lower level of the bond technology. Ultimately, all access to SQL Server is converted to MSSQL_C calls. This is the C program that links Arev to the network named pipes facility. This takes more work for the programmer but much less for the bond. Connections and data segments must be handled directly in the program but can be released when processing is complete. This frees resources in Arev and in SQL Server which might improve overall performance on a loaded LAN. The source code is littered with MSSQL_C calls:
/* Wolff Data Systems Arev copy with mssql example MSSQL_C calls to insert each record 16 minutes */ declare subroutine mssql_c open 'MASTER' to f_master then * make connection mssql_c_dat = xlate( 'VERBS', 'MSSQL_C_DAT', '', 'X') mssql_c( 1, errflag, errstr, mssql_c_dat) mssql_c(2,'SERVER','sa','','WDS_TEST','OPTI_386',dbprocess, errflag, errstr, mssql_c_dat) mssql_c(4,dbprocess,'usetest',errflag,errstr,mssql_c_dat) * read records select f_master done = '' count = 0 p_dos = 0 error = '' call dostime( start) loop until done count += 1 readnext k_master then read r_master from f_master, k_master then r_master_sql = '(':k_master:',"':field(r_master,@fm,1,7) r_master_sql< 8> = r_master< 8, 1> r_master_sql< 9> = r_master< 8, 2> r_master_sql< 10> = r_master< 8, 3> r_master_sql< 11> = field( r_master, @fm, 9, 3) r_master_sql< 14> = oconv( r_master< 12>, 'D4-') r_master_sql< 15> = oconv( r_master< 13>, 'D4-') r_master_sql< 16> = r_master< 15> r_master_sql< 17> = r_master< 16>: '")' swap @fm with '","' in r_master_sql r_master_sql[ 1, 0] = 'insert into master_sql values ' mssql_c(4,dbprocess,r_master_sql,errflag,errstr, mssql_c_dat) print @( 0, 21): count 'R#5': if errflag then error< -1> = errflag: ',': errstr end end else done = 1 repeat call dostime( finish) call msg( finish - start, '', '', '') oswrite error on 'error.sql' * close connection mssql_c( 14, dbprocess, errflag, errstr, mssql_c_dat) end
Equate statements can be used in place of the integer in the first MSSQL_C parameter. Only about 10 calls are used routinely so the numbers are easy to remember. Line 11 grabs the data segment from VERBS. Line 12 checks that the DBNMPIPE network named pipe utility is functioning. Line 13 connects to the server and line 14 sets the current database. The SQL INSERT statements are built the same as the previous example and match the syntax normally generated by the bond. Error handling is slightly different because MSSQL_C returns ample information about server connection problems. These errors are coded in errflag and explanatory text is stored in errstr. Line 46 closes the connection and frees the associated Arev and server memory.
Note that the DBPROCESS and MSSQL_C_DAT data segment variables are necessary for most of these calls. It is possible to have multiple connections open to the SQL Server. This is often done for routines that read the results of a complicated query and update individual rows from that query. Client server projects that rely heavily on MSSQL_C for performance often store open connection variables in labelled commons. This saves a few steps in each program and further improves performance at the expense of reserved server memory (only 42K per connection). This method gets down to the lowest level but requires the highest level of programmer proficiency. The test file took only 16 minutes which is 25% of the bonded file method. This approach is applicable to other Arev bonding tasks like complicated entry windows and custom popups.
SQL Server Bulk Copy
The last method requires an extra step outside of Arev but performs the best. The source records are parsed into text strings with each field separated by a tab (\09\) character. The lines are carriage return/line feed (\0A0D\) terminated. This format is compatible with the SQL Server BCP utility which is used for data loading. The code runs very fast since we do not wait for a response from SQL Server for each row. The combined time for Arev and BCP is only 8 minutes which is half that of MSSQL_C. There are also no server connections established from Arev. The code follows:
/* Wolff Data Systems Arev to bulkcopy example Build text file with formatted data and call BCP to import into table 5 minutes in Arev, 3 minutes in BCP */ open 'MASTER' to f_master then oswrite '' on 'c:\sql\master.sql' osopen 'c:\sql\master.sql' to f_dos then select f_master done = '' count = 0 p_dos = 0 error = '' call dostime( start) loop until done count += 1 print @( 0, 21): count: readnext k_master then read r_master from f_master, k_master then r_master_sql = k_master:\09\:field(r_master,@fm,1,7) r_master_sql< 8> = r_master< 8, 1> r_master_sql< 9> = r_master< 8, 2> r_master_sql< 10> = r_master< 8, 3> r_master_sql< 11> = field( r_master, @fm, 9, 3) r_master_sql< 14> = oconv( r_master< 12>, 'D4-') r_master_sql< 15> = oconv( r_master< 13>, 'D4- ') r_master_sql< 16> = field(r_master,@fm,15,2):\0D0A\ convert @fm to \09\ in r_master_sql osbwrite r_master_sql on f_dos at p_dos p_dos += len( r_master_sql) end end else done = 1 repeat call dostime( finish) call msg( finish - start, '', '', '') oswrite error on 'error.sql' ;* what's this? osclose f_dos * import records, assumes file already exists on server suspend 'bcp test..master_sql in c:\sql\master.sql /c /U sa /P /S server /m 1000' end end
Line 40 is the bulk copy utility which is in the DOS path. BCP has numerous parameters that allow it to support various input formats. ASCII text, CSV, and internal data types are supported. An input script details the file layout and specific field data types. This script is not necessary if the data is loaded in character (/c) format which is tab separated strings. In this case, SQL Server converts each field into the proper internal format. BCP is often used to transfer data from one SQL Server to another where internal data formats are normally maintained. The other parameters are explained below:
test..master_sql | full table name including database in data flow direction, could be out |
c:\sql\master.sql | DOS data file created in Arev |
/c | character data, separate with tabs |
/U sa | user login id |
/P | password |
/S server | server name |
/m 1000 | number of errors to allow |
The errors can be written to a text file with the /e option. This log can be studied for data problems. The /b batchsize parameter is useful with very large imports because it commits rows at a specified interval. Otherwise, all rows must be read with less errors than specified with /m or the entire load is rejected. This transaction approach to data loading is central to the security and reliability that makes SQL Server so popular.
One other note is worth discussing. With many data files to process, consider running the Arev ASCII export on one machine and the BCP on another. Better yet, run BCP directly on the OS/2 system hosting SQL Server. The data will load much quicker when local pipes are used instead of named pipes (no network traffic).
Conclusion
All four data loading methods arrive at the same result. The data is transferred to SQL Server and ready for query and application processing. With a little work, the load time can be reduced dramatically. There are other issues like transaction logging that might effect your implementation. Run a few tests and develop your own approach. If you have questions, post them on Compuserve in the OS2/SQL section of the Revelation forum.
(Volume 4, Issue 2, Pages 11-13)